Multilingual universal sentence encoder for semantic retrieval


Abstract

We present easy-to-use, retrieval-focused multilingual sentence embedding models, made available on TensorFlow Hub. The models embed text from 16 languages into a shared semantic space using a multi-task trained dual encoder that learns tied cross-lingual representations via translation bridge tasks (Chidambaram et al., 2018). The models achieve a new state of the art on monolingual and cross-lingual semantic retrieval (SR). Competitive performance is obtained on the related tasks of translation-pair bitext retrieval (BR) and retrieval question answering (ReQA). On transfer learning tasks, our multilingual embeddings approach, and in some cases exceed, the performance of English-only sentence embeddings.
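The semantic retrieval setup described above amounts to embedding a query and a pool of candidate sentences into the shared space, then ranking candidates by similarity to the query; because the encoder is cross-lingual, query and candidates may be in different languages. The following is a minimal sketch of that ranking step using NumPy and toy placeholder vectors in place of real encoder outputs (the actual models are loaded from TensorFlow Hub; the vectors and function name here are illustrative, not from the paper):

```python
import numpy as np

def cosine_retrieve(query_vec, candidate_vecs):
    """Rank candidates by cosine similarity to the query (highest first)."""
    # L2-normalize the query and each candidate, then take dot products.
    q = query_vec / np.linalg.norm(query_vec)
    c = candidate_vecs / np.linalg.norm(candidate_vecs, axis=1, keepdims=True)
    scores = c @ q
    return np.argsort(-scores)

# Toy 3-d vectors standing in for sentence embeddings.
query = np.array([0.9, 0.1, 0.0])
candidates = np.array([
    [0.8, 0.2, 0.1],   # semantically close to the query
    [0.0, 1.0, 0.0],
    [0.1, 0.0, 1.0],
])
ranking = cosine_retrieve(query, candidates)
print(ranking)  # candidate 0 ranks first
```

In practice the embeddings would come from the published encoder rather than hand-written vectors, and large candidate pools would use an approximate nearest-neighbor index instead of exhaustive dot products.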

Citation (APA)

Yang, Y., Cer, D., Ahmad, A., Guo, M., Law, J., Constant, N., … Kurzweil, R. (2020). Multilingual universal sentence encoder for semantic retrieval. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 87–94). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.acl-demos.12
