Abstract
Missing values are often unavoidable in real datasets. Imputation is generally preferred to deleting every record that contains a missing value, because deletion shrinks the dataset and can lead to poor research results if too little data remains. Records with missing values often lead to wrong conclusions, so this study tests several simple imputation methods, namely statistics-based imputation and k-nearest-neighbor imputation (kNNI). Errors measured with RMSE and MAPE show that kNNI gives much better results than statistics-based imputation. By the standard used for interpreting MAPE, almost all kNNI results are very good (error below 10%), the exceptions being three results on dataset 1 at k=10, k=15, and k=20. The statistics-based results are only good (error between 10% and 20%), and one of them exceeds 20%. Although kNNI outperforms statistics-based imputation, the value of k must be chosen carefully to obtain the best imputation results.
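The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy data, the choice of the column mean as the statistics-based method, the distance definition, and all function names (`mean_impute`, `knn_impute`, `rmse`, `mape`) are assumptions made for illustration only.

```python
import math

def column_means(rows):
    """Mean of the observed (non-None) values in each column."""
    means = []
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return means

def mean_impute(rows):
    """Statistics-based imputation (assumed here to be the column mean)."""
    means = column_means(rows)
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]

def knn_impute(rows, k):
    """Fill each None with the mean of that column over the k nearest rows.

    Assumption: distance is the root mean squared difference over the columns
    observed in both rows; a neighbor must have the target column observed.
    """
    means = column_means(rows)
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue  # no common observed column, distance undefined
                dist = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda pair: pair[0])
            nearest = candidates[:k]
            if nearest:
                filled[i][j] = sum(v for _, v in nearest) / len(nearest)
            else:
                filled[i][j] = means[j]  # fall back to the column mean
    return filled

def rmse(true_vals, pred_vals):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(true_vals, pred_vals))
                     / len(true_vals))

def mape(true_vals, pred_vals):
    """Mean absolute percentage error, in percent (true values must be nonzero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(true_vals, pred_vals)) / len(true_vals)

# Hypothetical toy data: two perfectly correlated numeric columns.
complete = [[float(x), 2.0 * x + 1.0] for x in range(1, 11)]
masked_cells = [(2, 1), (5, 0), (8, 1)]  # (row, column) pairs to hide

incomplete = [list(r) for r in complete]
for i, j in masked_cells:
    incomplete[i][j] = None

truth = [complete[i][j] for i, j in masked_cells]

def score(imputed):
    """Return (RMSE, MAPE) measured on the masked cells only."""
    preds = [imputed[i][j] for i, j in masked_cells]
    return rmse(truth, preds), mape(truth, preds)

scores = {
    "mean": score(mean_impute(incomplete)),
    "knn_k2": score(knn_impute(incomplete, 2)),
    "knn_k4": score(knn_impute(incomplete, 4)),
}

for name, (r, p) in scores.items():
    print(f"{name}: RMSE={r:.3f}  MAPE={p:.1f}%")
```

Because the toy columns are exactly linear, the numbers printed here say nothing about the study's datasets. The sketch only shows the mechanics: errors are computed on deliberately hidden cells, and the kNNI error changes with k, which is why the choice of k matters.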
Citation
Fadlil, A., Herman, & Dikky Praseptian, M. (2023). Single Imputation Using Statistics-Based and K Nearest Neighbor Methods for Numerical Datasets. Ingenierie Des Systemes d’Information, 28(2), 451–459. https://doi.org/10.18280/isi.280221