Fine-grained POS tagging of spoken Tunisian dialect corpora

Abstract

Arabic dialects (ADs) have recently begun to receive more attention from the speech science and technology communities. Using dialects in language technologies will help improve the development process and the usability of applications such as speech recognition, speech comprehension, or speech synthesis. However, ADs suffer from a lack of resources compared with Modern Standard Arabic (MSA). This paper addresses the problem of tagging one AD, the Tunisian Dialect (TD). We present a method for building a fine-grained part-of-speech (POS) tagger for the TD. The method consists of adapting an MSA POS tagger by generating a TD training corpus from an MSA corpus using a bilingual MSA-TD lexicon. Evaluated on a corpus of text transcriptions, the TD tagger achieved an accuracy of 78.5%. © Springer International Publishing Switzerland 2014.
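The corpus-generation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the lexicon format or the tagger, so the sketch assumes a lexicon keyed by (MSA word, POS tag) and substitutes a simple most-frequent-tag baseline for the adapted tagger. All tokens (`msa_go`, `td_go`, and so on) are hypothetical placeholders, not data from the paper.

```python
from collections import Counter, defaultdict

# Hypothetical toy MSA-TD lexicon keyed by (MSA word, POS tag).
# Placeholder entries only; the paper's real lexicon is not shown in the abstract.
LEXICON = {
    ("msa_go", "VERB"): "td_go",
    ("msa_now", "ADV"): "td_now",
}

# Hypothetical tagged MSA corpus: a list of sentences of (word, tag) pairs.
MSA_CORPUS = [
    [("msa_go", "VERB"), ("msa_now", "ADV"), ("msa_home", "NOUN")],
]


def convert_corpus(msa_corpus, lexicon):
    """Replace each MSA word with its TD equivalent, keeping the POS tag.

    Words without a lexicon entry are copied unchanged (an assumption).
    """
    return [
        [(lexicon.get((word, tag), word), tag) for word, tag in sentence]
        for sentence in msa_corpus
    ]


def train_unigram_tagger(corpus):
    """Learn the most frequent tag for each word (a stand-in baseline tagger)."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        for word, tag in sentence:
            counts[word][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}


def tag_words(words, model, default="NOUN"):
    """Tag a list of words, falling back to a default tag for unseen words."""
    return [(word, model.get(word, default)) for word in words]


td_corpus = convert_corpus(MSA_CORPUS, LEXICON)
model = train_unigram_tagger(td_corpus)
tagged = tag_words(["td_go", "td_unseen"], model)
```

Keying the lexicon on the word together with its tag is one way to keep the MSA annotations aligned with the substituted TD words; a real system would also need to handle one-to-many mappings and morphology, which this sketch ignores.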

Cite

Boujelbane, R., Mallek, M., Ellouze, M., & Belguith, L. H. (2014). Fine-grained POS tagging of spoken Tunisian dialect corpora. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8455 LNCS, pp. 59–62). Springer Verlag. https://doi.org/10.1007/978-3-319-07983-7_9
