In this paper, we explored different levels of textual representation for cross-lingual information retrieval. Beyond the traditional token-level representation, we adopted subword-level and character-level representations for retrieval, which had been shown to improve neural machine translation by reducing out-of-vocabulary issues. Additionally, we improved search performance by combining and re-ranking the result sets from the different text representations for German, French and Japanese.
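As a rough illustration of the three representation levels and of merging their result sets, the following minimal Python sketch may help. Everything in it is an assumption for illustration rather than the authors' implementation: the function names, the toy subword vocabulary, the greedy longest-match segmentation (a stand-in for a trained subword model), and the use of reciprocal rank fusion as the combination step, which the abstract does not specify.

```python
from collections import defaultdict


def token_repr(text):
    """Token level: lowercase and split on whitespace."""
    return text.lower().split()


def subword_repr(text, vocab):
    """Subword level: greedy longest-match segmentation against a toy vocabulary.

    Hypothetical stand-in for a trained subword model; characters that match
    no vocabulary entry are emitted on their own.
    """
    pieces = []
    for token in token_repr(text):
        i = 0
        while i < len(token):
            # Try the longest candidate first, shrinking towards one character.
            for j in range(len(token), i, -1):
                if token[i:j] in vocab or j == i + 1:
                    pieces.append(token[i:j])
                    i = j
                    break
    return pieces


def char_ngram_repr(text, n=3):
    """Character level: overlapping character n-grams, spaces marked as '_'."""
    s = text.lower().replace(" ", "_")
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in.
    Assumed combination method, not taken from the paper.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score, breaking ties by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))


if __name__ == "__main__":
    query = "rote schuhe"
    print(token_repr(query))
    print(subword_repr(query, {"rot", "schuh", "e"}))
    print(char_ngram_repr(query))
    # One hypothetical ranked list per representation level.
    print(reciprocal_rank_fusion([
        ["d1", "d2", "d3"],
        ["d2", "d3", "d1"],
        ["d2", "d1", "d4"],
    ]))
```

In a setup like this, each representation would index and query the collection separately, and the fused list would be what is shown to the user or passed to a further re-ranker.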
CITATION STYLE
Zhang, B., & Tan, L. (2021). Textual Representations for Crosslingual Information Retrieval. In ECNLP 2021 - 4th Workshop on e-Commerce and NLP, Proceedings (pp. 116–122). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.ecnlp-1.14