POS-tagging of Tunisian Dialect Using Standard Arabic Resources and Tools

Abstract

Developing natural language processing tools usually requires a large number of resources (lexica, annotated corpora, etc.), which often do not exist for less-resourced languages. One way to overcome this lack of resources is to devote substantial effort to building new ones from scratch. Another approach is to exploit existing resources for closely related languages. In this paper, we focus on developing a part-of-speech tagger for the Tunisian Arabic dialect (TUN), a low-resource language, by exploiting its closeness to Modern Standard Arabic (MSA), which has many state-of-the-art resources and tools. Our system achieved an accuracy of 89% (∼20% absolute improvement over an MSA tagger baseline).

Cite

Hamdi, A., Nasr, A., Habash, N., & Gala, N. (2015). POS-tagging of Tunisian dialect using Standard Arabic resources and tools. In 2nd Workshop on Arabic Natural Language Processing, ANLP 2015 - held at 53rd Annual Meeting of the Association for Computational Linguistics, ACL 2015 - Proceedings (pp. 59–68). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w15-3207
