This paper proposes and evaluates a nearest-neighbor method for imputing missing values in ordinal/continuous datasets. In a nutshell, the K-Means clustering algorithm is first applied to the complete portion of the dataset (the instances without missing values), before any nearest-neighbor imputation takes place. The resulting cluster centroids are then used as the training instances for the nearest-neighbor method, in place of the full set of complete instances. The proposed method is more efficient than the traditional nearest-neighbor method, and simulations performed on three benchmark datasets also indicate that it provides suitable imputations for both prediction and classification tasks. © Springer-Verlag Berlin Heidelberg 2004.
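The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`sq_dist`, `kmeans`, `impute`), the evenly spaced choice of initial centroids, the use of a single nearest centroid, and the toy data are all assumptions made for illustration. Missing values are represented as `None`, and distances are computed over the observed attributes only.

```python
def sq_dist(row, centroid):
    """Squared Euclidean distance over the attributes observed in `row`."""
    return sum((x - c) ** 2 for x, c in zip(row, centroid) if x is not None)


def kmeans(points, k, iters=20):
    """Plain Lloyd-style K-Means on complete rows; returns k centroids.

    Assumption: initial centroids are evenly spaced rows of `points`,
    chosen only to keep this sketch deterministic.
    """
    step = len(points) // k
    centroids = [tuple(points[i * step]) for i in range(k)]
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = [
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids


def impute(row, centroids):
    """Fill the missing entries of `row` from its nearest centroid."""
    nearest = min(centroids, key=lambda c: sq_dist(row, c))
    return tuple(c if x is None else x for x, c in zip(row, nearest))


# Toy complete dataset (hypothetical): two well-separated groups.
complete = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0),
]
centroids = kmeans(complete, k=2)

# The incomplete row is compared with 2 centroids instead of 8 rows.
print(impute((10.2, None), centroids))  # (10.2, 10.5)
```

In this sketch the efficiency gain comes from the search step: each incomplete row is compared against the k centroids rather than against every complete instance, at the price of running the clustering once beforehand.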
CITATION STYLE
Hruschka, E. R., Hruschka, E. R., & Ebecken, N. F. F. (2004). Towards efficient imputation by nearest-neighbors: A clustering-based approach. In Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (Vol. 3339, pp. 513–525). Springer Verlag. https://doi.org/10.1007/978-3-540-30549-1_45