BaNeP: An End-to-End Neural Network Based Model for Bangla Parts-of-Speech Tagging

This article is free to access.

Abstract

In Natural Language Processing, Parts-of-Speech (POS) tagging is a vital component that significantly impacts applications such as machine translation, spell checking, information retrieval, and speech processing. In languages such as English and Dutch, POS tagging is considered a solved problem (accuracy: 97%). However, for low-resource languages like Bangla, challenges remain. In this article, we propose a novel RNN-based network named BaNeP to determine the parts of speech of Bangla words. The proposed network extracts structural features through a bidirectional LSTM-based sub-network, and identifies intricate contextual relations among the words of a sentence through an elaborate weighted context extraction procedure. These features are then combined to generate the final Parts-of-Speech prediction. Training the model requires only an annotated dataset, eliminating the need for any hand-crafted features. Experimental results on the LDC2010T16 dataset show a significant accuracy improvement over existing Bangla POS taggers.

Cite

Ovi, J. A., Islam, M. A., & Karim, M. R. (2022). BaNeP: An End-to-End Neural Network Based Model for Bangla Parts-of-Speech Tagging. IEEE Access, 10, 102753–102769. https://doi.org/10.1109/ACCESS.2022.3208269
