Joint Learning of Deep Retrieval Model and Product Quantization based Embedding Index

Abstract

An embedding index that enables fast approximate nearest neighbor (ANN) search serves as an indispensable component of state-of-the-art deep retrieval systems. Traditional approaches, which often separate embedding learning from index building, incur additional indexing time and degraded retrieval accuracy. In this paper, we propose a novel method called Poeem (product quantization based embedding index jointly trained with deep retrieval model), which unifies the two steps in a single end-to-end training process using several techniques: the gradient straight-through estimator, a warm start strategy, optimal space decomposition, and Givens rotation. Extensive experimental results show that the proposed method not only improves retrieval accuracy significantly but also reduces the indexing time to almost none. We have open-sourced our approach for the sake of comparison and reproducibility.
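
For readers unfamiliar with product quantization, the following is a minimal NumPy sketch of the encode and decode steps that such an embedding index builds on. It is an illustration under stated assumptions, not the Poeem implementation: the function names `pq_encode` and `pq_decode` are hypothetical, the codebooks are random rather than learned, and the sketch covers only the quantization step, not the joint training, straight-through estimator, warm start, or Givens rotation described in the abstract.

```python
import numpy as np


def pq_encode(x, codebooks):
    """Assign each subvector of x to its nearest centroid.

    x:         (n, d) array of embeddings.
    codebooks: (m, k, d // m) array, i.e. m subspaces with k centroids each.
    Returns an (n, m) integer array of centroid indices.
    """
    m, k, sub_dim = codebooks.shape
    n = x.shape[0]
    # Split every embedding into m contiguous subvectors.
    sub = x.reshape(n, m, sub_dim)
    # Squared Euclidean distance to every centroid, shape (n, m, k).
    dists = ((sub[:, :, None, :] - codebooks[None, :, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=-1)


def pq_decode(codes, codebooks):
    """Rebuild approximate embeddings by concatenating the chosen centroids."""
    m, k, sub_dim = codebooks.shape
    # For subspace j, pick centroid codes[:, j]; result has shape (n, m, sub_dim).
    parts = codebooks[np.arange(m)[None, :], codes]
    return parts.reshape(codes.shape[0], m * sub_dim)


# Hypothetical toy data: 5 embeddings of dimension 8, 4 subspaces, 16 centroids each.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(5, 8))
codebooks = rng.normal(size=(4, 16, 2))

codes = pq_encode(embeddings, codebooks)   # compact integer codes
approx = pq_decode(codes, codebooks)       # quantized reconstruction
```

Each embedding is stored as `m` small integers instead of `d` floats, which is what makes ANN search over the index fast. The quantization step is a non-differentiable `argmin`, which is why a joint training scheme such as the one the abstract describes needs a device like the straight-through estimator to pass gradients through it.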

Cite

Zhang, H., Shen, H., Qiu, Y., Jiang, Y., Wang, S., Xu, S., … Yang, W. Y. (2021). Joint Learning of Deep Retrieval Model and Product Quantization based Embedding Index. In SIGIR 2021 - Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 1718–1722). Association for Computing Machinery, Inc. https://doi.org/10.1145/3404835.3462988
