Locality-sensitive hashing for finding nearest neighbors in probability distributions

Yi Kun Tang; Xian Ling Mao; Yi Jing Hao; Cheng Xu; Heyan Huang

Conference Proceedings

Communications in Computer and Information Science (2017) 774 3-15

DOI: 10.1007/978-981-10-6805-8_1

2 Citations

4 Readers

Abstract

Over the past ten years, powerful new algorithms based on efficient data structures have been proposed for Approximate Nearest Neighbor (ANN) search. Existing Locality-Sensitive Hashing (LSH) algorithms for vector-type data can be applied directly to find nearest neighbors in probability-distribution-type data. However, these methods do not take the special properties of probability distributions into account. In this paper, we exploit these properties to present a novel LSH scheme adapted to angular distance for ANN search in high-dimensional probability distributions. We define the specific hash functions and prove their locality sensitivity. We also propose a Sequential Interleaving algorithm based on the “Unbalance Effect” of the Euclidean and angular metrics for probability distributions. Finally, we compare our methods experimentally with state-of-the-art LSH algorithms for ANN search on six public image databases. The results show that the proposed algorithms provide far better accuracy than the baselines.
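For orientation, the kind of vector-type angular LSH that the abstract says can be applied directly to probability distributions can be sketched as follows. This is a minimal illustration of the standard random-hyperplane (sign of random projection) scheme, not the scheme proposed in the paper; the function names `angular_distance`, `make_hyperplane_hash`, and `hamming` are hypothetical and chosen only for this example.

```python
import math
import random


def angular_distance(p, q):
    """Angle in radians between two non-zero vectors p and q."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def make_hyperplane_hash(dim, n_bits, seed=0):
    """Return a hash function mapping a vector to n_bits sign bits.

    Each bit is the sign of the projection onto a random Gaussian
    direction. For one bit, two vectors at angle theta receive
    different values with probability theta / pi.
    """
    rng = random.Random(seed)
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

    def hash_fn(p):
        return tuple(
            1 if sum(w * x for w, x in zip(plane, p)) >= 0 else 0
            for plane in planes
        )

    return hash_fn


def hamming(a, b):
    """Number of positions at which two bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


# Illustrative data (assumed, not from the paper): three small histograms.
p = [0.70, 0.20, 0.10]
q = [0.65, 0.25, 0.10]   # close to p
r = [0.05, 0.15, 0.80]   # far from p

hash_fn = make_hyperplane_hash(dim=3, n_bits=2000, seed=42)
near_fraction = hamming(hash_fn(p), hash_fn(q)) / 2000
far_fraction = hamming(hash_fn(p), hash_fn(r)) / 2000
```

With enough bits, the fraction of differing bits approximates the angle divided by pi, so `near_fraction` should come out well below `far_fraction`. Because probability vectors have only non-negative entries, the angle between any two of them is at most pi / 2; this generic scheme makes no use of that fact.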

Cite

Tang, Y. K., Mao, X. L., Hao, Y. J., Xu, C., & Huang, H. (2017). Locality-sensitive hashing for finding nearest neighbors in probability distributions. In Communications in Computer and Information Science (Vol. 774, pp. 3–15). Springer Verlag. https://doi.org/10.1007/978-981-10-6805-8_1
