Improved methods for the imputation of missing data by nearest neighbor methods

  • Tutz G
  • Ramzan S

Missing data raise problems in almost all fields of quantitative research. A useful nonparametric procedure is the nearest neighbor imputation method. Improved versions of this method are presented. First, a weighted nearest neighbor imputation method based on Lq distances is proposed. It is demonstrated that the method tends to have a smaller imputation error than other nearest neighbor estimates. Then weighted nearest neighbor imputation methods that use distances for selected covariates are considered. The careful selection of distances that carry information about the missing values yields an imputation tool that can outperform competing nearest neighbor methods. This approach performs well, especially when the number of predictors is large. The methods are evaluated in simulation studies and with several real data sets from different fields.
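
The weighted nearest neighbor idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian kernel, the `bandwidth` and `k` parameters, the averaging of the distance over co-observed columns, and the function names `lq_distance` and `weighted_nn_impute` are all assumptions made for illustration. The optional `columns` argument is a hypothetical stand-in for computing distances on selected covariates only; how those covariates are chosen is not specified here.

```python
import math


def lq_distance(a, b, q=2.0, columns=None):
    """Average L_q distance over columns observed in both rows (None = missing).

    `columns` optionally restricts the comparison to selected covariates
    (an illustrative assumption, not the paper's selection rule).
    """
    cols = range(len(a)) if columns is None else columns
    diffs = [
        abs(a[s] - b[s]) ** q
        for s in cols
        if a[s] is not None and b[s] is not None
    ]
    if not diffs:
        return math.inf  # no shared observed columns: rows are not comparable
    return (sum(diffs) / len(diffs)) ** (1.0 / q)


def weighted_nn_impute(data, q=2.0, k=3, bandwidth=1.0, columns=None):
    """Return a copy of `data` with each missing cell replaced by a
    kernel-weighted average of that column's values in the k nearest rows."""
    imputed = [list(row) for row in data]
    for i, row in enumerate(data):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donor rows must have column j observed and share columns with row i.
            candidates = []
            for r, other in enumerate(data):
                if r == i or other[j] is None:
                    continue
                d = lq_distance(row, other, q, columns)
                if math.isfinite(d):
                    candidates.append((d, other[j]))
            if not candidates:
                continue  # nothing to borrow from; leave the cell missing
            candidates.sort(key=lambda pair: pair[0])
            nearest = candidates[:k]
            # Gaussian kernel weights (assumed): closer donors count more.
            weights = [math.exp(-0.5 * (d / bandwidth) ** 2) for d, _ in nearest]
            total = sum(weights)
            if total == 0.0:
                # All weights underflowed; fall back to an unweighted mean.
                imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
            else:
                imputed[i][j] = (
                    sum(w * v for w, (_, v) in zip(weights, nearest)) / total
                )
    return imputed


example = [
    [1.0, 2.0, None],
    [1.0, 2.0, 10.0],
    [5.0, 6.0, 20.0],
]
filled = weighted_nn_impute(example, q=2.0, k=2, bandwidth=1.0)
print(filled[0][2])
```

In this toy example the first row is identical to the second on the observed columns, so the imputed value lies very close to that row's value. In practice, tuning values such as `q`, `k`, and `bandwidth` would presumably be chosen by a data-driven procedure such as cross-validation, which appears among the keywords.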

Author-supplied keywords

  • Cross-validation
  • Kernel function
  • MCAR
  • Weighted imputation
  • Weighted nearest neighbors
