K nearest neighbours with mutual information for simultaneous classification and missing data imputation

Pedro J. García-Laencina; José Luis Sancho-Gómez; Aníbal R. Figueiras-Vidal; Michel Verleysen

Journal Article

Neurocomputing (2009) 72(7-9) 1483-1493

DOI: 10.1016/j.neucom.2008.11.026

203 Citations

168 Readers

Abstract

Missing data is a common drawback in many real-life pattern classification scenarios. One of the most popular solutions is missing data imputation by the K nearest neighbours (KNN) algorithm. In this article, we propose a novel KNN imputation procedure using a feature-weighted distance metric based on mutual information (MI). This method provides a missing data estimation aimed at solving the classification task, i.e., it provides an imputed dataset which is directed toward improving the classification performance. The MI-based distance metric is also used to implement an effective KNN classifier. Experimental results on both artificial and real classification datasets are provided to illustrate the efficiency and the robustness of the proposed algorithm. © 2009 Elsevier B.V. All rights reserved.
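The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes a histogram-based MI estimate with equal-width bins, a Euclidean distance in which each feature is weighted by its MI with the class label and only features observed in both rows are compared, and imputation by the plain mean of the K nearest donors. All function names, the `bins` parameter, and the toy data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(values, labels, bins=4):
    """Histogram estimate of MI (in nats) between one feature and the labels.

    Assumption: equal-width binning over the observed (non-None) values.
    """
    pairs = [(v, c) for v, c in zip(values, labels) if v is not None]
    if not pairs:
        return 0.0
    lo = min(v for v, _ in pairs)
    hi = max(v for v, _ in pairs)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for constant features
    binned = [(min(int((v - lo) / width), bins - 1), c) for v, c in pairs]
    n = len(binned)
    joint = Counter(binned)
    bin_counts = Counter(b for b, _ in binned)
    class_counts = Counter(c for _, c in binned)
    return sum(
        (cnt / n) * math.log(cnt * n / (bin_counts[b] * class_counts[c]))
        for (b, c), cnt in joint.items()
    )


def weighted_distance(a, b, weights):
    """MI-weighted Euclidean distance over features observed in both rows."""
    num = den = 0.0
    for x, z, w in zip(a, b, weights):
        if x is not None and z is not None:
            num += w * (x - z) ** 2
            den += w
    return math.sqrt(num / den) if den > 0 else math.inf


def knn_impute(X, y, k=3, bins=4):
    """Fill each missing entry with the mean of its k nearest donors."""
    n_features = len(X[0])
    weights = [
        mutual_information([row[j] for row in X], y, bins)
        for j in range(n_features)
    ]
    imputed = [list(row) for row in X]
    for i, row in enumerate(X):
        for j in range(n_features):
            if row[j] is not None:
                continue
            # Donors are other rows in which feature j is observed.
            donors = sorted(
                (weighted_distance(row, other, weights), other[j])
                for m, other in enumerate(X)
                if m != i and other[j] is not None
            )[:k]
            donors = [v for d, v in donors if math.isfinite(d)]
            if donors:
                imputed[i][j] = sum(donors) / len(donors)
    return imputed, weights


def knn_classify(query, X, y, weights, k=3):
    """Majority vote among the k nearest rows under the same weighted distance."""
    nearest = sorted(
        (weighted_distance(query, row, weights), label)
        for row, label in zip(X, y)
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Toy data (made up): feature 0 separates the classes, feature 1 is noise,
# feature 2 tracks feature 0 and is missing in the last row.
X = [
    [0.0, 0.5, 0.0], [0.1, 0.1, 1.0], [0.2, 0.9, 2.0],
    [1.0, 0.5, 10.0], [1.1, 0.1, 11.0], [0.9, 0.9, 9.0],
    [0.05, 0.91, None],
]
y = [0, 0, 0, 1, 1, 1, 0]

imputed, weights = knn_impute(X, y, k=3)
predicted = knn_classify([1.0, 0.5, 10.0], imputed, y, weights, k=3)
print(weights, imputed[6], predicted)
```

Because the weights come from MI with the class label, the noise feature contributes little to the distance, so the donors for the missing value are chosen mainly by the class-relevant feature. In this sketch that is what ties the imputation to the classification task, and the same distance is reused by the classifier.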

Citation (APA)

García-Laencina, P. J., Sancho-Gómez, J. L., Figueiras-Vidal, A. R., & Verleysen, M. (2009). K nearest neighbours with mutual information for simultaneous classification and missing data imputation. Neurocomputing, 72(7–9), 1483–1493. https://doi.org/10.1016/j.neucom.2008.11.026
