Multi-lingual phrase-based statistical machine translation for Arabic-English


Abstract

In this paper, we implement a multi-lingual Statistical Machine Translation (SMT) system for Arabic-English translation. Arabic text can be categorized into standard Arabic (Modern Standard Arabic, MSA) and dialectal Arabic, and these two forms differ significantly. We compare different mono-lingual and multi-lingual hybrid SMT approaches. Mono-lingual systems consistently achieve good translation accuracy on one Arabic form and poor accuracy on the other. Multi-lingual SMT models trained on pooled parallel MSA/dialectal data achieve better accuracy. However, because far more parallel MSA data are available than dialectal data, these pooled models are biased toward MSA. In this work, we propose a multi-lingual combination of different mono-lingual systems using an Arabic form classifier. The output of the classifier directs the system to use the appropriate mono-lingual models (standard, dialectal, or mixture). Testing the different SMT systems shows that the proposed classifier-based SMT system outperforms both the mono-lingual and the data-pooled multi-lingual systems.
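The classifier-based combination described in the abstract can be illustrated with a minimal Python sketch. It shows only the routing step: a form label is predicted for each sentence, and the sentence is passed to the system registered for that label. The abstract does not say how the classifier or the SMT systems are implemented, so the names `make_routed_translator`, `classify`, `systems`, and the label strings are all hypothetical, and both the classifier and the translation systems are supplied by the caller as plain functions.

```python
from typing import Callable, Dict

# Label names are an assumption; the abstract only names the three cases
# (standard, dialectal, or mixture).
FORMS = ("standard", "dialectal", "mixture")

Translator = Callable[[str], str]      # Arabic sentence -> English sentence
FormClassifier = Callable[[str], str]  # Arabic sentence -> one of FORMS


def make_routed_translator(classify: FormClassifier,
                           systems: Dict[str, Translator]) -> Translator:
    """Build a translator that selects a system per sentence by Arabic form.

    Hypothetical helper: it only dispatches; it does not train or run SMT.
    """
    missing = [form for form in FORMS if form not in systems]
    if missing:
        raise ValueError(f"no translation system for: {missing}")

    def translate(sentence: str) -> str:
        form = classify(sentence)
        if form not in systems:
            raise ValueError(f"unexpected classifier label: {form!r}")
        # Send the sentence to the system matching its predicted form.
        return systems[form](sentence)

    return translate
```

In this sketch, `systems["mixture"]` would hold whatever model is used for mixed input, for example one trained on pooled data; that pairing is an assumption, since the abstract does not specify it.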

Cite

Bastawisy, A., & Elmahdy, M. (2017). Multi-lingual phrase-based statistical machine translation for Arabic-English. In International Conference Recent Advances in Natural Language Processing, RANLP (Vol. 2017-September, pp. 86–89). Incoma Ltd. https://doi.org/10.26615/978-954-452-049-6_013
