We present a novel multi-task modeling approach to learning multilingual distributed representations of text. Our system learns word and sentence embeddings jointly by training a multilingual skip-gram model together with a cross-lingual sentence similarity model. Our architecture can transparently use both monolingual and sentence-aligned bilingual corpora to learn multilingual embeddings, thus covering a vocabulary significantly larger than that of the bilingual corpora alone. Our model shows competitive performance on a standard cross-lingual document classification task. We also show the effectiveness of our method in a limited-resource scenario.
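The abstract describes the architecture only at a high level. The following is a minimal PyTorch sketch of how such a joint objective could be set up: a shared embedding table trained with a negative-sampling skip-gram loss (where, for the multilingual case, context words may be drawn from the aligned sentence in the other language) plus a cross-lingual sentence similarity loss over aligned pairs. All names, sizes, and the specific loss choices (negative sampling, masked-mean sentence embeddings, cosine similarity) are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical sizes; the paper's actual hyperparameters are not given in the abstract.
VOCAB_SIZE = 10000   # joint multilingual vocabulary
EMB_DIM = 128

class MultiTaskEmbedder(nn.Module):
    """Shared word embeddings trained with two objectives:
    (1) a skip-gram style context-prediction loss (mono- or cross-lingual),
    (2) a cross-lingual sentence similarity loss over aligned sentence pairs."""

    def __init__(self, vocab_size, emb_dim):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)  # input ("word") vectors
        self.ctx = nn.Embedding(vocab_size, emb_dim)  # output ("context") vectors

    def skipgram_loss(self, centers, contexts, negatives):
        """Negative-sampling skip-gram loss. For the multilingual variant,
        `contexts` may come from the aligned sentence in the other language."""
        c = self.emb(centers)                                     # (B, D)
        pos = self.ctx(contexts)                                  # (B, D)
        neg = self.ctx(negatives)                                 # (B, K, D)
        pos_score = torch.sum(c * pos, dim=-1)                    # (B,)
        neg_score = torch.bmm(neg, c.unsqueeze(-1)).squeeze(-1)   # (B, K)
        return -F.logsigmoid(pos_score).mean() - F.logsigmoid(-neg_score).mean()

    def sentence_embed(self, token_ids, mask):
        """Sentence embedding as a masked mean of word vectors (a simplification;
        the paper's sentence encoder may differ)."""
        vecs = self.emb(token_ids)                                # (B, T, D)
        summed = (vecs * mask.unsqueeze(-1)).sum(dim=1)
        return summed / mask.sum(dim=1, keepdim=True).clamp(min=1)

    def similarity_loss(self, src_ids, src_mask, tgt_ids, tgt_mask):
        """Pull embeddings of aligned sentence pairs together (an illustrative
        choice; the actual similarity objective may differ)."""
        s = self.sentence_embed(src_ids, src_mask)
        t = self.sentence_embed(tgt_ids, tgt_mask)
        return (1 - F.cosine_similarity(s, t, dim=-1)).mean()

model = MultiTaskEmbedder(VOCAB_SIZE, EMB_DIM)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Toy batch with random indices, just to show the joint multi-task update.
centers = torch.randint(0, VOCAB_SIZE, (32,))
contexts = torch.randint(0, VOCAB_SIZE, (32,))
negatives = torch.randint(0, VOCAB_SIZE, (32, 5))
src_ids = torch.randint(0, VOCAB_SIZE, (8, 12)); src_mask = torch.ones(8, 12)
tgt_ids = torch.randint(0, VOCAB_SIZE, (8, 15)); tgt_mask = torch.ones(8, 15)

loss = (model.skipgram_loss(centers, contexts, negatives)
        + model.similarity_loss(src_ids, src_mask, tgt_ids, tgt_mask))
opt.zero_grad(); loss.backward(); opt.step()
```

Because both losses update the same embedding table, monolingual text contributes through the skip-gram term while aligned bilingual text contributes through both terms, which is what lets the shared vocabulary grow beyond the bilingual corpora alone.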
CITATION
Singla, K., Can, D., & Narayanan, S. (2018). A multi-task approach to learning multilingual representations. In ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers) (Vol. 2, pp. 214–220). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/p18-2035