Fine-grained POS tagging of spoken Tunisian dialect corpora

Rahma Boujelbane; Mariem Mallek; Mariem Ellouze; Lamia Hadrich Belguith

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2014) 8455 LNCS 59-62

DOI: 10.1007/978-3-319-07983-7_9

Abstract

Arabic dialects (AD) have recently begun to receive more attention from the speech science and technology communities. Using dialects in language technologies will help improve the development process and the usability of applications such as speech recognition, speech comprehension, and speech synthesis. However, AD suffer from a lack of resources compared to Modern Standard Arabic (MSA). This paper addresses the problem of tagging one AD: the Tunisian Dialect (TD). We present a method for building a fine-grained part-of-speech (POS) tagger for TD. The method consists of adapting an MSA POS tagger by generating a TD training corpus from an MSA corpus using a bilingual MSA-TD lexicon. Evaluated on a corpus of text transcriptions, the TD tagger achieved an accuracy of 78.5%. © Springer International Publishing Switzerland 2014.
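The corpus-generation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the lexicon is structured or applied, so the function name `project_corpus`, the lexicon keyed by (MSA word, POS tag), the fallback of keeping unknown words unchanged, and the placeholder tokens are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

TaggedToken = Tuple[str, str]          # (word, POS tag)
Lexicon = Dict[Tuple[str, str], str]   # (MSA word, POS tag) -> TD word (assumed layout)


def project_corpus(msa_corpus: List[List[TaggedToken]],
                   lexicon: Lexicon) -> List[List[TaggedToken]]:
    """Build a TD training corpus from a tagged MSA corpus.

    Each MSA word is replaced by its TD equivalent when the lexicon has
    an entry for the (word, tag) pair; the POS tag is carried over as-is.
    Words without an entry are kept unchanged (an assumption of this sketch).
    """
    td_corpus = []
    for sentence in msa_corpus:
        td_sentence = []
        for word, tag in sentence:
            td_word = lexicon.get((word, tag), word)
            td_sentence.append((td_word, tag))
        td_corpus.append(td_sentence)
    return td_corpus


# Placeholder tokens stand in for real MSA and TD words.
msa_corpus = [[("msa_w1", "NOUN"), ("msa_w2", "VERB")]]
lexicon = {("msa_w1", "NOUN"): "td_w1"}
td_corpus = project_corpus(msa_corpus, lexicon)
```

The resulting `td_corpus` could then be used to retrain an existing MSA tagger, which is the adaptation the abstract refers to; the choice of tagger and tag set is left open here because the abstract does not name them.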

Boujelbane, R., Mallek, M., Ellouze, M., & Belguith, L. H. (2014). Fine-grained POS tagging of spoken Tunisian dialect corpora. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8455 LNCS, pp. 59–62). Springer Verlag. https://doi.org/10.1007/978-3-319-07983-7_9