Estimation by the Nearest Neighbor Rule

Abstract

Let R* denote the Bayes risk (minimum expected loss) for the problem of estimating θ ∈ Θ, given an observed random variable x, joint probability distribution F(x,θ), and loss function L. Consider the problem in which the only knowledge of F is that which can be inferred from samples (x1,θ1), (x2,θ2), …, (xn,θn), where the (xi,θi)'s are independently and identically distributed according to F. Let the nearest neighbor estimate of the parameter θ associated with an observation x be defined to be the parameter θn' associated with the nearest neighbor xn' to x. Let R be the large sample risk of the nearest neighbor rule. It will be shown, for a wide range of probability distributions, that R ≤ 2R* for metric loss functions and R = 2R* for squared-error loss functions. A simple estimator using the nearest k neighbors yields R = R*(1 + 1/k) in the squared-error loss case. In this sense, it can be said that at least half the information in the infinite training set is contained in the nearest neighbor. This paper is an extension of earlier work [4] from the problem of classification by the nearest neighbor rule to that of estimation. However, the unbounded loss functions in the estimation problem introduce additional problems concerning the convergence of the unconditional risk. Thus some work is devoted to the investigation of natural conditions on the underlying distribution assuring the desired convergence. © 1968 IEEE. All rights reserved.
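
The squared-error claims in the abstract can be checked numerically on a toy problem. The sketch below is not taken from the paper. It assumes a hypothetical distribution F in which x is uniform on [0, 1] and θ = sin(2πx) plus Gaussian noise of standard deviation `sigma`, so that the Bayes risk under squared-error loss is R* = sigma². It also assumes that the "simple estimator using the nearest k neighbors" is the plain average of the parameters of the k nearest sample points. The function names `knn_estimate`, `draw`, and `empirical_risk` are illustrative only.

```python
import heapq
import math
import random


def knn_estimate(x, sample, k):
    """Average the parameters attached to the k sample points nearest to x."""
    nearest = heapq.nsmallest(k, sample, key=lambda pair: abs(pair[0] - x))
    return sum(theta for _, theta in nearest) / k


def draw(rng, sigma):
    """Draw one (x, theta) pair from the assumed toy distribution F."""
    x = rng.random()
    theta = math.sin(2 * math.pi * x) + rng.gauss(0.0, sigma)
    return x, theta


def empirical_risk(k, n=1000, queries=1000, sigma=0.5, seed=0):
    """Monte Carlo estimate of the squared-error risk of the k-NN estimate."""
    rng = random.Random(seed)
    sample = [draw(rng, sigma) for _ in range(n)]  # training pairs (xi, thetai)
    total = 0.0
    for _ in range(queries):
        x, theta = draw(rng, sigma)  # fresh observation and its true parameter
        total += (theta - knn_estimate(x, sample, k)) ** 2
    return total / queries


if __name__ == "__main__":
    sigma = 0.5
    bayes_risk = sigma ** 2  # R* for squared-error loss in this toy model
    for k in (1, 2, 4):
        ratio = empirical_risk(k, sigma=sigma) / bayes_risk
        print(f"k={k}: estimated R/R* = {ratio:.2f}, "
              f"large-sample value 1 + 1/k = {1 + 1 / k:.2f}")
```

With a finite sample the estimated ratios only approximate 1 + 1/k, since the abstract's result concerns the large sample risk; the agreement should tighten as `n` and `queries` grow.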

Cover, T. M. (1968). Estimation by the Nearest Neighbor Rule. IEEE Transactions on Information Theory, 14(1), 50–55. https://doi.org/10.1109/TIT.1968.1054098
