Fast near neighbor search in high-dimensional binary data

Anshumali Shrivastava; Ping Li

Conference Proceedings, Open Access

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2012), Vol. 7523 LNAI (Part 1), pp. 474–489

DOI: 10.1007/978-3-642-33460-3_36

24 Citations

27 Readers

Abstract

Numerous applications in search, databases, machine learning, and computer vision can benefit from efficient algorithms for near neighbor search. This paper proposes a simple framework for fast near neighbor search in high-dimensional binary data, which are common in practice (e.g., text). We develop a very simple and effective strategy for sub-linear time near neighbor search by building hash tables directly from the bits generated by b-bit minwise hashing. The advantages of our method are demonstrated through thorough comparisons with two strong baselines: spectral hashing and sign (1-bit) random projections. © 2012 Springer-Verlag.
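
The abstract names the core idea (hash tables indexed by b-bit minwise hash values) without giving details, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes each binary vector is given as a non-empty set of its nonzero feature indices, approximates random permutations with linear hash functions modulo a prime, concatenates k b-bit values into one bucket key per table, and reranks candidates by exact resemblance (Jaccard similarity). The class name `BBitMinHashIndex`, the helper `jaccard`, and the parameters `b`, `k`, and `num_tables` are all hypothetical names chosen for illustration.

```python
import random
from collections import defaultdict

# Assumption: feature indices are non-negative integers smaller than this prime.
PRIME = 2_147_483_647  # 2^31 - 1


def jaccard(s, t):
    """Exact resemblance |s & t| / |s | t| of two sets."""
    union = len(s | t)
    return len(s & t) / union if union else 0.0


class BBitMinHashIndex:
    """Illustrative index: num_tables hash tables, each keyed by k concatenated b-bit minhashes."""

    def __init__(self, b=2, k=4, num_tables=8, seed=0):
        rng = random.Random(seed)
        self.b, self.k = b, k
        # One (a, c) pair per hash function; k functions per table.
        self.params = [
            [(rng.randrange(1, PRIME), rng.randrange(PRIME)) for _ in range(k)]
            for _ in range(num_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(num_tables)]
        self.items = {}

    def _key(self, features, table_params):
        """Concatenate the lowest b bits of each of the k minwise hash values."""
        mask = (1 << self.b) - 1
        key = 0
        for a, c in table_params:
            min_hash = min((a * x + c) % PRIME for x in features)
            key = (key << self.b) | (min_hash & mask)
        return key

    def add(self, item_id, features):
        features = frozenset(features)
        self.items[item_id] = features
        for table, table_params in zip(self.tables, self.params):
            table[self._key(features, table_params)].append(item_id)

    def query(self, features, top=5):
        """Collect candidates from the matching bucket of every table, then rerank."""
        features = frozenset(features)
        candidates = set()
        for table, table_params in zip(self.tables, self.params):
            candidates.update(table.get(self._key(features, table_params), ()))
        scored = [(c, jaccard(features, self.items[c])) for c in candidates]
        scored.sort(key=lambda pair: -pair[1])
        return scored[:top]


index = BBitMinHashIndex(b=2, k=4, num_tables=8, seed=0)
index.add("doc1", {1, 5, 9, 12})
index.add("doc2", {1, 5, 9, 40})
index.add("doc3", {100, 200, 300})
print(index.query({1, 5, 9, 12}))
```

Only the buckets that the query hashes to are inspected, which is what makes the lookup potentially sub-linear in the number of stored items. In this sketch, larger `b` or `k` gives finer buckets and fewer candidates, while more tables raise the chance that a true near neighbor shares at least one bucket with the query; suitable values depend on the data and are not taken from the paper.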

Cite

APA

Shrivastava, A., & Li, P. (2012). Fast near neighbor search in high-dimensional binary data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7523 LNAI, pp. 474–489). https://doi.org/10.1007/978-3-642-33460-3_36
