Empowering Dual-Encoder with Query Generator for Cross-Lingual Dense Retrieval

Abstract

In monolingual dense retrieval, much prior work focuses on distilling knowledge from a cross-encoder re-ranker into a dual-encoder retriever, and these methods achieve better performance thanks to the effectiveness of the cross-encoder re-ranker. However, we find that the performance of the cross-encoder re-ranker is heavily influenced by the number of training samples and the quality of negative samples, both of which are hard to obtain in the cross-lingual setting. In this paper, we propose using a query generator as the teacher in the cross-lingual setting, since it depends less on abundant training samples and high-quality negative samples. In addition to traditional knowledge distillation, we further propose a novel enhancement method that uses the query generator to help the dual-encoder align queries from different languages without requiring any additional parallel sentences. Experimental results show that our method outperforms state-of-the-art methods on two benchmark datasets.
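
To make the teacher-student idea concrete, the following is a minimal sketch of score distillation into a dual-encoder. It is not the paper's implementation. It assumes that the teacher provides one relevance score per candidate passage (for a query generator, this might be the log-likelihood of the query given the passage), that the student scores a passage by the dot product of query and passage embeddings, and that the loss is the KL divergence between the two softmax distributions. The names `distillation_loss`, `teacher_scores`, and `temperature` are illustrative and do not come from the abstract.

```python
import math


def softmax(scores, temperature=1.0):
    """Turn raw scores into a probability distribution (numerically stable)."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def distillation_loss(query_vec, passage_vecs, teacher_scores, temperature=1.0):
    """KL(teacher || student) over one query's candidate passages.

    Assumption: teacher_scores[i] is the teacher's relevance score for
    passage i, e.g. a query generator's log p(query | passage_i).
    """
    # Student (dual-encoder) relevance: dot product of the two embeddings.
    student_scores = [dot(query_vec, p) for p in passage_vecs]
    p_teacher = softmax(teacher_scores, temperature)
    p_student = softmax(student_scores, temperature)
    # Sum t * log(t / s), skipping terms where the teacher probability is zero.
    return sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)
```

The loss is zero when the student ranks the candidates with the same distribution as the teacher and grows as the two distributions diverge, so minimizing it pushes the dual-encoder's scores toward the teacher's preferences.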

Ren, H., Shou, L., Wu, N., Gong, M., & Jiang, D. (2022). Empowering Dual-Encoder with Query Generator for Cross-Lingual Dense Retrieval. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 (pp. 3107–3121). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.emnlp-main.203