Automatic identification of Moroccan Colloquial Arabic

Abstract

Language identification is an NLP task that aims to predict the language of a given text. Many attempts have been made to address it for Arabic dialects. In this paper, we present our approach to building a language identification system that distinguishes between Moroccan Colloquial Arabic and Arabic languages using two different methods. The first is rule-based and relies on stop word frequency, while the second is statistically based and uses several machine learning classifiers. The results show that the statistical approach outperforms the rule-based approach. Furthermore, the Support Vector Machines classifier is more accurate than the other statistical classifiers. Our goal in this paper is to pave the way toward building advanced Moroccan dialect NLP tools such as morphological analyzers and machine translation systems.
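
The rule-based method mentioned in the abstract can be pictured with a minimal sketch. It is not the authors' implementation: the stop-word lists below are tiny, transliterated, hypothetical examples, and the function name `identify` and the labels `"MCA"`, `"ARABIC"`, and `"unknown"` are assumptions made for illustration. The sketch counts how many tokens fall in each stop-word list and picks the language with more hits.

```python
# Hypothetical, transliterated stop-word lists for illustration only.
# They are NOT taken from the paper; a real system would use curated
# lists in Arabic script.
MCA_STOP_WORDS = {"dyal", "bzaf", "hadchi", "wach", "ghir", "daba"}
ARABIC_STOP_WORDS = {"hatha", "allathi", "laysa", "sawfa", "inna", "lakinna"}


def identify(text):
    """Label a text by comparing stop-word hit counts (assumed decision rule)."""
    tokens = text.lower().split()
    mca_hits = sum(token in MCA_STOP_WORDS for token in tokens)
    arabic_hits = sum(token in ARABIC_STOP_WORDS for token in tokens)
    if mca_hits > arabic_hits:
        return "MCA"
    if arabic_hits > mca_hits:
        return "ARABIC"
    # Ties (including zero hits on both sides) are left undecided.
    return "unknown"
```

A call such as `identify("wach hadchi dyal daba")` would match only the first list. A rule of this kind depends entirely on the coverage of its word lists, which is one plausible reason a statistical classifier trained on labeled text could do better, as the abstract reports.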

Cite

Tachicart, R., Bouzoubaa, K., Aouragh, S. L., & Jaafa, H. (2018). Automatic identification of Moroccan Colloquial Arabic. In Communications in Computer and Information Science (Vol. 782, pp. 201–214). Springer Verlag. https://doi.org/10.1007/978-3-319-73500-9_15
