Improving low-resource cross-lingual document retrieval by reranking with deep bilingual representations


Abstract

In this paper, we propose to boost low-resource cross-lingual document retrieval performance with deep bilingual query-document representations. We match queries and documents in both source and target languages with four components, each of which is implemented as a term interaction-based deep neural network with cross-lingual word embeddings as input. By including query likelihood scores as extra features, our model effectively learns to rerank the retrieved documents using a small number of relevance labels for low-resource language pairs. Due to the shared cross-lingual word embedding space, the model can also be applied directly to another language pair without any training labels. Experimental results on the MATERIAL dataset show that our model outperforms the competitive translation-based baselines on English-Swahili, English-Tagalog, and English-Somali cross-lingual information retrieval tasks.
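The abstract's core idea can be illustrated with a minimal sketch (not the authors' implementation): a term interaction-based matcher scores a document by building a query-term by document-term cosine similarity matrix over a shared cross-lingual embedding space, pooling it into a feature, and combining that feature with a query likelihood score to produce a reranking score. All function names, the pooling choice, and the linear combination weight below are illustrative assumptions.

```python
import numpy as np

def cosine_interaction_matrix(query_vecs, doc_vecs):
    """Cosine similarity between every query term and every document term.

    Both inputs are (num_terms, dim) arrays of word embeddings drawn from
    a shared cross-lingual embedding space (toy stand-ins here).
    """
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    return q @ d.T  # shape: (num_query_terms, num_doc_terms)

def rerank_score(query_vecs, doc_vecs, query_likelihood, w=0.5):
    """Combine pooled term-interaction evidence with a query likelihood feature.

    `w` is an illustrative mixing weight; the paper instead learns the
    combination with a deep network over four matching components.
    """
    sim = cosine_interaction_matrix(query_vecs, doc_vecs)
    # Max-pool over document terms (best match per query term), then average.
    interaction_feature = sim.max(axis=1).mean()
    return w * interaction_feature + (1.0 - w) * query_likelihood

# Toy example: a 2-term query and a 3-term document in a 4-dim space.
rng = np.random.default_rng(0)
query = rng.normal(size=(2, 4))
doc = rng.normal(size=(3, 4))
print(rerank_score(query, doc, query_likelihood=0.3))
```

Because the embedding space is shared across languages, the same scoring function can be applied unchanged to query/document pairs from a new language pair, which is what enables the zero-label transfer described in the abstract.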

Citation (APA)
Zhang, R., Westerfield, C., Shim, S., Bingham, G., Fabbri, A., Hu, W., … Radev, D. (2020). Improving low-resource cross-lingual document retrieval by reranking with deep bilingual representations. In ACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (pp. 3173–3179). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/p19-1306
