Multi-lingual phrase-based statistical machine translation for Arabic-English

Ahmed Bastawisy; Mohamed Elmahdy

Conference Proceedings, Open Access

International Conference Recent Advances in Natural Language Processing, RANLP (2017) 2017-September 86-89

DOI: 10.26615/978-954-452-049-6_013

Abstract

In this paper, we implement a multi-lingual Statistical Machine Translation (SMT) system for Arabic-English translation. Arabic text can be categorized into Modern Standard Arabic (MSA) and dialectal Arabic, and these two forms differ significantly. We compare different mono-lingual and multi-lingual hybrid SMT approaches. Mono-lingual systems always yield better translation accuracy on one Arabic form and poor accuracy on the other. Multi-lingual SMT models trained on pooled parallel MSA/dialectal data result in better accuracy. However, since the available parallel MSA data are much larger than the dialectal data, these multi-lingual models are biased toward MSA. In this work, we propose a multi-lingual combination of different mono-lingual systems using an Arabic form classifier. The outcome of the classifier directs the system to use the appropriate mono-lingual models (standard, dialectal, or mixture). Testing the different SMT systems shows that the proposed classifier-based SMT system outperforms both the mono-lingual and the data-pooled multi-lingual systems.
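
The classifier-based routing described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how the classifier arrives at its three-way decision, so the sketch assumes a hypothetical classifier that returns a dialect probability and a hypothetical `margin` parameter that sends uncertain inputs to the mixture model. The names `choose_route`, `translate`, and the stub translators are all invented for illustration.

```python
from typing import Callable, Dict

# A translator is any function from an Arabic sentence to English text.
Translator = Callable[[str], str]


def choose_route(p_dialect: float, margin: float = 0.2) -> str:
    """Map an assumed dialect probability to one of three model routes.

    Assumption: scores far from 0.5 are treated as confident decisions,
    and scores within `margin` of 0.5 fall back to the mixture model.
    """
    if not 0.0 <= p_dialect <= 1.0:
        raise ValueError("p_dialect must be in [0, 1]")
    if p_dialect >= 0.5 + margin:
        return "dialectal"
    if p_dialect <= 0.5 - margin:
        return "standard"
    return "mixture"


def translate(
    sentence: str,
    classify: Callable[[str], float],
    models: Dict[str, Translator],
    margin: float = 0.2,
) -> str:
    """Classify the Arabic form, then call the model chosen for it."""
    route = choose_route(classify(sentence), margin)
    return models[route](sentence)


# Stub translators standing in for trained SMT systems (hypothetical).
stub_models: Dict[str, Translator] = {
    "standard": lambda s: "[MSA model] " + s,
    "dialectal": lambda s: "[dialect model] " + s,
    "mixture": lambda s: "[mixture model] " + s,
}
```

In a real system the stubs would be replaced by calls into separately trained mono-lingual and mixed SMT models, and `classify` by a trained Arabic form classifier; the routing logic itself stays this small.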

Cite

Bastawisy, A., & Elmahdy, M. (2017). Multi-lingual phrase-based statistical machine translation for Arabic-English. In International Conference Recent Advances in Natural Language Processing, RANLP (Vol. 2017-September, pp. 86–89). Incoma Ltd. https://doi.org/10.26615/978-954-452-049-6_013
