A parallel corpus for evaluating machine translation between Arabic and European languages

Abstract

We present Arab-Acquis, a large publicly available dataset for evaluating machine translation between 22 European languages and Arabic. Arab-Acquis consists of over 12,000 sentences from the JRC-Acquis (Acquis Communautaire) corpus, each translated twice by professional translators, once from English and once from French, totaling over 600,000 words. The corpus follows previous data splits in the literature for tuning, development, and testing. We describe the corpus and how it was created. We also present the first benchmarking results on translating to and from Arabic for 22 European languages.

Cite

Habash, N., Zalmout, N., Taji, D., Hieu, H., & Alzate, M. (2017). A parallel corpus for evaluating machine translation between Arabic and European languages. In 15th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2017 - Proceedings of Conference (Vol. 2, pp. 235–241). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/e17-2038
