BaNeP: An End-to-End Neural Network Based Model for Bangla Parts-of-Speech Tagging

Jesan Ahammed Ovi; Md Ashraful Islam; Md Rezaul Karim

Journal Article, Open Access

IEEE Access (2022) 10 102753-102769

DOI: 10.1109/ACCESS.2022.3208269

Abstract

In Natural Language Processing, Parts-of-Speech (POS) tagging is a vital component that significantly impacts applications such as machine translation, spell checking, information retrieval, and speech processing. In languages such as English and Dutch, POS tagging is considered a solved problem (accuracy: 97%). However, for low-resource languages like Bangla, challenges remain. In this article, we propose a novel RNN-based network named BaNeP to determine parts of speech for Bangla words. The proposed network extracts structural features through a bidirectional LSTM-based sub-network, and identifies intricate contextual relations among the words of a sentence through an elaborate weighted context extraction procedure. These features are then combined to generate the final Parts-of-Speech prediction. Training the model requires only an annotated dataset, eliminating the need for any hand-crafted features. Experimental results on the LDC2010T16 dataset show a significant accuracy improvement over existing Bangla POS taggers.
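The abstract does not say how the weighted context extraction is computed. As a rough illustration only, the sketch below shows one generic way to weight the words of a sentence relative to a target word: a softmax over dot-product similarity, followed by a weighted sum. The function names, the toy vectors, and the choice of dot-product scoring are all assumptions made for this example; this is not the procedure used in BaNeP.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_context(vectors, target_index):
    """Hypothetical helper: weight every word vector by its similarity
    to the target word, then return the weights and the weighted sum.

    Assumption: similarity is a plain dot product. The paper may use a
    different scoring scheme.
    """
    target = vectors[target_index]
    scores = [sum(a * b for a, b in zip(target, v)) for v in vectors]
    weights = softmax(scores)
    dim = len(target)
    context = [
        sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)
    ]
    return weights, context


# Toy 2-dimensional "word vectors" for a three-word sentence (made up).
sentence_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = weighted_context(sentence_vectors, target_index=0)
print(weights)
print(context)
```

In a tagger of the kind the abstract outlines, a context vector like this would presumably be combined with the per-word features from the bidirectional LSTM before the final prediction. How BaNeP actually combines them is not stated here.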

Cite

Ovi, J. A., Islam, M. A., & Karim, M. R. (2022). BaNeP: An End-to-End Neural Network Based Model for Bangla Parts-of-Speech Tagging. IEEE Access, 10, 102753–102769. https://doi.org/10.1109/ACCESS.2022.3208269
