Deep neural networks for part-of-speech tagging in under-resourced Amazigh

Abstract

Part-of-speech (POS) tagging assigns an appropriate grammatical category to each word in a sentence or text and plays a pivotal role in numerous natural language processing (NLP) tasks. While POS tagging for widely used languages such as English has reached accuracy levels exceeding 97%, less-resourced languages such as Amazigh have seen limited research and, consequently, lower tagging accuracy. This paper aims to bridge this gap by exploring deep learning models for Amazigh POS tagging, focusing on several types of recurrent neural networks (RNNs): gated recurrent networks, long short-term memory (LSTM) networks, and bidirectional LSTM networks. Despite a relatively small dataset of 60k tokens, in stark contrast to the vast corpora available for well-resourced languages, the proposed RNN models demonstrate significant improvements over existing Amazigh POS taggers. All RNN models tested in this study outperformed traditional machine learning taggers, achieving an accuracy of 97%, which presents a promising avenue for improved POS tagging in under-resourced languages. This research underscores the potential of deep learning approaches to advance linguistic studies of less-documented languages such as Amazigh.
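
To make the accuracy figures in the abstract concrete, the following minimal sketch shows one common way to score a POS tagger: the fraction of tokens whose predicted tag matches the gold tag. This is an assumption for illustration only; the abstract does not state how accuracy was computed, and the function name `token_accuracy`, the tag labels, and the toy sentences are hypothetical rather than taken from the paper or its Amazigh corpus.

```python
from typing import Sequence


def token_accuracy(
    gold: Sequence[Sequence[str]],
    predicted: Sequence[Sequence[str]],
) -> float:
    """Return the fraction of tokens whose predicted tag equals the gold tag."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must contain the same number of sentences")

    correct = 0
    total = 0
    for gold_tags, pred_tags in zip(gold, predicted):
        if len(gold_tags) != len(pred_tags):
            raise ValueError("each sentence needs exactly one predicted tag per token")
        # Count positions where the tagger agrees with the annotation.
        correct += sum(g == p for g, p in zip(gold_tags, pred_tags))
        total += len(gold_tags)

    if total == 0:
        raise ValueError("no tokens to score")
    return correct / total


# Hypothetical tag sequences for two short sentences (illustrative labels only).
gold = [["DET", "NOUN", "VERB"], ["PRON", "VERB", "DET", "NOUN"]]
predicted = [["DET", "NOUN", "VERB"], ["PRON", "NOUN", "DET", "NOUN"]]

score = token_accuracy(gold, predicted)
print(f"token accuracy: {score:.3f}")
```

In this toy case six of the seven tokens are tagged correctly. Scoring per token rather than per sentence is the usual convention for POS tagging, which is why a figure such as 97% can coexist with many sentences containing at least one error.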

Cite

Bani, R., Amri, S., Zenkouar, L., & Guennoun, Z. (2023). Deep neural networks for part-of-speech tagging in under-resourced Amazigh. Revue d’Intelligence Artificielle, 37(3), 611–617. https://doi.org/10.18280/ria.370310
