Leveraging Bidirectionl LSTM with CRFs for Pashto Tagging

Abstract

Part-of-speech tagging plays a vital role in text processing and natural language understanding. Very few attempts have been made at part-of-speech tagging for Pashto. In this work, we present a Long Short-Term Memory-based approach for Pashto part-of-speech tagging, with a special focus on ambiguity resolution. We first created a corpus of Pashto sentences containing words with multiple meanings, together with their tags. We introduce a powerful sentence representation and a new architecture for Pashto text processing. The accuracy of the proposed approach is compared with that of a state-of-the-art Hidden Markov Model. Our model achieves 87.60% accuracy on all words excluding punctuation and 95.45% on ambiguous words, whereas the Hidden Markov Model achieves 78.37% and 44.72%, respectively. These results show that our approach outperforms the Hidden Markov Model in part-of-speech tagging for Pashto text.
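The abstract reports two accuracy figures: one over all words excluding punctuation and one over ambiguous words only. The following is a minimal sketch of how such figures could be computed from tagger output. It is not the authors' evaluation code: the token layout, the `PUNCT` tag name, and the toy data are assumptions made for illustration.

```python
from typing import Iterable, Set, Tuple

# Each token is (word, gold_tag, predicted_tag). This layout, the "PUNCT"
# tag name, and the toy data below are illustrative assumptions, not
# details taken from the paper.
Token = Tuple[str, str, str]


def accuracy(tokens: Iterable[Token]) -> float:
    """Fraction of tokens whose predicted tag equals the gold tag."""
    tokens = list(tokens)
    if not tokens:
        return 0.0
    correct = sum(1 for _, gold, pred in tokens if gold == pred)
    return correct / len(tokens)


def accuracy_excluding_punctuation(tokens: Iterable[Token],
                                   punct_tag: str = "PUNCT") -> float:
    """Accuracy over tokens whose gold tag is not the punctuation tag."""
    return accuracy(t for t in tokens if t[1] != punct_tag)


def accuracy_on_ambiguous(tokens: Iterable[Token],
                          ambiguous_words: Set[str]) -> float:
    """Accuracy restricted to words listed as ambiguous."""
    return accuracy(t for t in tokens if t[0] in ambiguous_words)


# Toy data: "w1" appears with two different gold tags, so it is ambiguous.
toy_tokens = [
    ("w1", "NOUN", "NOUN"),
    ("w2", "VERB", "NOUN"),
    ("w1", "VERB", "VERB"),
    (".", "PUNCT", "PUNCT"),
]
toy_ambiguous = {"w1"}

non_punct_acc = accuracy_excluding_punctuation(toy_tokens)
ambiguous_acc = accuracy_on_ambiguous(toy_tokens, toy_ambiguous)
```

Reporting the two numbers separately matters because unambiguous words dominate most corpora, so overall accuracy can hide how well a tagger handles words whose tag depends on context.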

Cite

Zaman, F., Maqbool, O., & Kanwal, J. (2024). Leveraging Bidirectionl LSTM with CRFs for Pashto Tagging. ACM Transactions on Asian and Low-Resource Language Information Processing, 23(4). https://doi.org/10.1145/3649456
