BERT-Based Sentiment Analysis for Low-Resourced Languages: A Case Study of Urdu Language

Abstract

Sentiment analysis holds significant importance in research by providing valuable insights into public opinion. However, the majority of sentiment analysis studies focus on the English language, leaving a research gap for low-resourced or regional languages such as Persian, Pashto, and Urdu. Moreover, computational linguists face the challenge of developing lexical resources for these languages. In light of this, this paper presents a deep learning-based approach for Urdu Text Sentiment Analysis (USA-BERT), leveraging Bidirectional Encoder Representations from Transformers (BERT), and introduces the Urdu Dataset for Sentiment Analysis-23 (UDSA-23). USA-BERT first preprocesses the Urdu reviews using the BERT tokenizer. Second, it creates BERT embeddings for each Urdu review. Third, given the BERT embeddings, it fine-tunes a BERT-based deep learning classifier. Finally, it applies the Pareto principle (an 80/20 train–test split) to two datasets, the state-of-the-art UCSA-21 and the proposed UDSA-23, to assess USA-BERT. The assessment results demonstrate that USA-BERT significantly surpasses existing methods, improving accuracy and F-measure by up to 26.09% and 25.87%, respectively.
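The Pareto-principle evaluation mentioned above amounts to an 80/20 train–test split of each review dataset. A minimal sketch of such a split is shown below; the function name, seed, and toy corpus are illustrative assumptions, not taken from the paper:

```python
import random

def pareto_split(reviews, seed=42):
    """Shuffle and split a list of (text, label) pairs into
    80% training and 20% test data (the Pareto principle)."""
    rng = random.Random(seed)       # fixed seed for reproducibility
    shuffled = reviews[:]           # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    cut = int(0.8 * len(shuffled))  # 80% boundary
    return shuffled[:cut], shuffled[cut:]

# Hypothetical toy corpus standing in for labeled Urdu reviews.
corpus = [(f"review {i}", i % 2) for i in range(100)]
train, test = pareto_split(corpus)
print(len(train), len(test))  # 80 20
```

In practice the split would be applied to the tokenized UCSA-21 and UDSA-23 reviews before fine-tuning, with the 20% held-out portion used to compute accuracy and F-measure.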

Citation (APA)

Ashraf, M. R., Jana, Y., Umer, Q., Jaffar, M. A., Chung, S., & Ramay, W. Y. (2023). BERT-Based Sentiment Analysis for Low-Resourced Languages: A Case Study of Urdu Language. IEEE Access, 11, 110245–110259. https://doi.org/10.1109/ACCESS.2023.3322101
