Arabic Native Language Identification

31Citations
Citations of this article
104Readers
Mendeley users who have this article in their library.

Abstract

In this paper we present the first application of Native Language Identification (NLI) to Arabic learner data. NLI, the task of predicting a writer’s first language from their writing in other languages has been mostly investigated with English data, but is now expanding to other languages. We use L2 texts from the newly released Arabic Learner Corpus and with a combination of three syntactic features (CFG production rules, Arabic function words and Part-of-Speech n-grams), we demonstrate that they are useful for this task. Our system achieves an accuracy of 41% against a baseline of 23%, providing the first evidence for classifier-based detection of language transfer effects in L2 Arabic. Such methods can be useful for studying language transfer, developing teaching materials tailored to students’ native language and forensic linguistics. Future directions are discussed.

Cite

CITATION STYLE

APA

Malmasi, S., & Dras, M. (2014). Arabic Native Language Identification. In ANLP 2014 - EMNLP 2014 Workshop on Arabic Natural Language Processing, Proceedings (pp. 180–186). Association for Computational Linguistics (ACL). https://doi.org/10.3115/v1/w14-3625

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free