Abstract
Spatial data structures, for vector or metric spaces, are a well-known means of speeding up proximity queries. The neighbors found for a query object are commonly used in classification methods, e.g., the well-known κ-nearest neighbor (κ-NN) algorithm. Still, most experimental works focus on providing attractive tradeoffs between neighbor search time and neighborhood quality, while ignoring the impact of such tradeoffs on classification accuracy. In this paper, we explore a few simple approximate and probabilistic variants of two popular spatial data structures, the k-d tree and the ball tree, reporting κ-NN results on real data sets. The main difference between these two structures is the location of the input data (in all nodes for the k-d tree, only in the leaves for the ball tree), and for this reason they act as good representatives of other spatial structures. We show that in several cases significant speedups over exact κ-NN classification with these structures are possible, at a moderate penalty in accuracy. We conclude that the k-d tree is the more promising approach.
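To make the k-d tree's "data in every node" layout concrete, here is a minimal Python sketch of a classic k-d tree with an exact κ-NN query. It is an illustrative implementation, not the authors' code; the function and field names are invented for this example, and it omits the approximate/probabilistic variants the paper studies.

```python
import heapq

def build_kdtree(points, depth=0):
    """Build a k-d tree; note that every node, not just the leaves, holds a point."""
    if not points:
        return None
    axis = depth % len(points[0])          # cycle through coordinates
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2                 # median split on the current axis
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def knn(node, query, k, heap=None):
    """Exact k-nearest-neighbor search; returns a heap of (-squared_dist, point)."""
    if heap is None:
        heap = []
    if node is None:
        return heap
    dist = sum((a - b) ** 2 for a, b in zip(node["point"], query))
    # Max-heap of the k best candidates, implemented with negated distances.
    if len(heap) < k:
        heapq.heappush(heap, (-dist, node["point"]))
    elif dist < -heap[0][0]:
        heapq.heapreplace(heap, (-dist, node["point"]))
    diff = query[node["axis"]] - node["point"][node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    knn(near, query, k, heap)
    # Prune: visit the far subtree only if the splitting plane could hide a closer point.
    if len(heap) < k or diff ** 2 < -heap[0][0]:
        knn(far, query, k, heap)
    return heap
```

An approximate variant of the kind the paper evaluates can be obtained, for instance, by skipping the far-subtree visit even when the pruning test fails, trading neighborhood quality for speed.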
Citation
Cisłak, A., & Grabowski, S. (2014). Experimental evaluation of selected tree structures for exact and approximate κ-nearest neighbor classification. In 2014 Federated Conference on Computer Science and Information Systems, FedCSIS 2014 (Vol. 2014-January, pp. 93–100). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.15439/2014F194