Numerous applications in search, databases, machine learning, and computer vision, can benefit from efficient algorithms for near neighbor search. This paper proposes a simple framework for fast near neighbor search in high-dimensional binary data, which are common in practice (e.g., text). We develop a very simple and effective strategy for sub-linear time near neighbor search, by creating hash tables directly using the bits generated by b-bit minwise hashing. The advantages of our method are demonstrated through thorough comparisons with two strong baselines: spectral hashing and sign (1-bit) random projections. © 2012 Springer-Verlag.
CITATION STYLE
Shrivastava, A., & Li, P. (2012). Fast near neighbor search in high-dimensional binary data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7523 LNAI, pp. 474–489). https://doi.org/10.1007/978-3-642-33460-3_36
Mendeley helps you to discover research relevant for your work.