Cross-lingual learning-to-rank with shared representations

Abstract

Cross-lingual information retrieval (CLIR) is a document retrieval task where the documents are written in a language different from that of the user's query. This is a challenging problem for data-driven approaches due to the general lack of labeled training data. We introduce a large-scale dataset derived from Wikipedia to support CLIR research in 25 languages. Further, we present a simple yet effective neural learning-to-rank model that shares representations across languages and reduces the data requirement. This model can exploit training data in, for example, Japanese-English CLIR to improve the results of Swahili-English CLIR.
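The abstract does not spell out the model's architecture, so the following is only a minimal sketch of the general idea of parameter sharing in a cross-lingual ranker, not the authors' implementation. It assumes mean-pooled token embeddings, a cosine-similarity score, and a pairwise hinge loss; the toy vocabularies and all names (`query_emb`, `doc_emb`, `W_shared`, `score`, `pairwise_hinge_loss`) are hypothetical. Each query language has its own embedding table, while the English document embeddings and the projection layer are shared, so training on Japanese-English pairs would also update parameters used for Swahili-English ranking.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8  # embedding size, chosen arbitrarily for this sketch


def make_embeddings(vocab, dim=DIM):
    """Random stand-ins for embeddings that would normally be learned."""
    return {word: rng.normal(size=dim) for word in vocab}


# Language-specific parameters: one query embedding table per query language.
query_emb = {
    "ja": make_embeddings(["neko", "inu"]),
    "sw": make_embeddings(["paka", "mbwa"]),
}

# Shared parameters: all documents are English, so every language pair
# uses the same document embeddings and the same projection matrix.
doc_emb = make_embeddings(["cat", "dog", "pet", "car"])
W_shared = rng.normal(size=(DIM, DIM)) / np.sqrt(DIM)


def encode(tokens, table):
    """Mean-pool token embeddings, then apply the shared projection."""
    pooled = np.mean([table[t] for t in tokens], axis=0)
    return np.tanh(W_shared @ pooled)


def score(lang, query_tokens, doc_tokens):
    """Cosine similarity between the encoded query and encoded document."""
    q = encode(query_tokens, query_emb[lang])
    d = encode(doc_tokens, doc_emb)
    return float(q @ d / (np.linalg.norm(q) * np.linalg.norm(d)))


def pairwise_hinge_loss(lang, query, pos_doc, neg_doc, margin=1.0):
    """Zero only when the relevant document outscores the irrelevant one by `margin`."""
    gap = score(lang, query, pos_doc) - score(lang, query, neg_doc)
    return max(0.0, margin - gap)


# The same scoring function serves both language pairs.
loss_ja = pairwise_hinge_loss("ja", ["neko"], ["cat", "pet"], ["car"])
loss_sw = pairwise_hinge_loss("sw", ["paka"], ["cat", "pet"], ["car"])
```

In a trainable version, gradients from either loss would flow into `doc_emb` and `W_shared`, which is one way a low-resource pair could benefit from a higher-resource one.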

Cite

Sasaki, S., Sun, S., Schamoni, S., Duh, K., & Inui, K. (2018). Cross-lingual learning-to-rank with shared representations. In NAACL HLT 2018 - 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference (Vol. 2, pp. 458–463). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/n18-2073
