Single Imputation Using Statistics-Based and K Nearest Neighbor Methods for Numerical Datasets

Abdul Fadlil; Herman; M. Dikky Praseptian

Journal Article (Open Access)

Ingenierie des Systemes d'Information (2023) 28(2) 451-459

DOI: 10.18280/isi.280221

5 Citations

18 Readers

Abstract

Missing values are an often unavoidable problem in data analysis. Imputation is generally preferred over deleting every record that contains a missing value, because deletion shrinks the dataset and can lead to poor research results if too little data remains. Records with missing values frequently lead to wrong conclusions, so this study tests several simple imputation methods, namely statistics-based imputation and k-nearest-neighbor imputation (kNNI). Error testing with RMSE and MAPE shows that kNNI results are much better than those of statistics-based imputation. By the standard used for the MAPE test, almost all kNNI results are very good (error below 10%), the exceptions being three results on dataset 1 at k=10, k=15, and k=20. The statistics-based results are only good (error between 10% and 20%), and one of them exceeds 20%. Although kNNI outperforms statistics-based imputation, the right value of k must be chosen to obtain the best imputation results.
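
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which statistic or distance measure was used, so column-mean imputation and a Euclidean-style distance over shared observed columns are assumptions here, and the six-row dataset is a made-up toy example rather than one of the paper's datasets. All function names are hypothetical.

```python
import math
from statistics import mean


def mean_impute(rows):
    """Replace each None with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    col_means = [
        mean(r[j] for r in rows if r[j] is not None) for j in range(n_cols)
    ]
    return [[col_means[j] if v is None else v for j, v in enumerate(r)] for r in rows]


def _distance(a, b):
    """Root mean squared difference over columns observed in both rows.

    Returns None when the two rows share no observed column.
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return None
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))


def knn_impute(rows, k):
    """Fill each None with the mean of that column over the k nearest donor rows."""
    out = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, v in enumerate(row):
            if v is not None:
                continue
            # Donors are other rows with column j observed and a defined distance.
            donors = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue
                d = _distance(row, other)
                if d is not None:
                    donors.append((d, other[j]))
            donors.sort(key=lambda pair: pair[0])
            nearest = donors[:k]
            if nearest:  # leave the gap if no donor exists
                out[i][j] = mean(val for _, val in nearest)
    return out


def rmse(true_vals, pred_vals):
    """Root mean squared error."""
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(true_vals, pred_vals)))


def mape(true_vals, pred_vals):
    """Mean absolute percentage error, in percent (true values must be non-zero)."""
    return 100.0 * mean(abs((t - p) / t) for t, p in zip(true_vals, pred_vals))


# Toy data (assumption, for illustration only): two well-separated groups of rows.
complete = [
    [1.0, 2.0], [1.1, 2.1], [0.9, 1.9],
    [10.0, 20.0], [10.2, 20.2], [9.8, 19.8],
]
masked = [(1, 1), (4, 0)]  # cells hidden to simulate missing values
observed = [list(r) for r in complete]
for i, j in masked:
    observed[i][j] = None

truth = [complete[i][j] for i, j in masked]
mean_filled = mean_impute(observed)
knn_filled = knn_impute(observed, k=2)
mean_pred = [mean_filled[i][j] for i, j in masked]
knn_pred = [knn_filled[i][j] for i, j in masked]

mean_rmse, mean_mape = rmse(truth, mean_pred), mape(truth, mean_pred)
knn_rmse, knn_mape = rmse(truth, knn_pred), mape(truth, knn_pred)
print(f"mean: RMSE={mean_rmse:.3f} MAPE={mean_mape:.1f}%")
print(f"kNN : RMSE={knn_rmse:.3f} MAPE={knn_mape:.1f}%")
```

On this toy data the kNN estimate draws only on rows from the same group, whereas the column mean mixes both groups, which is why its error is far larger. Rerunning `knn_impute` with a larger `k` pulls in donors from the other group, a small-scale analogue of the abstract's point that k must be chosen with care.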

Cite

Fadlil, A., Herman, & Dikky Praseptian, M. (2023). Single Imputation Using Statistics-Based and K Nearest Neighbor Methods for Numerical Datasets. Ingenierie Des Systemes d’Information, 28(2), 451–459. https://doi.org/10.18280/isi.280221
