Abstract
In this study, we examine the behavior of several graph-based models for Bangla text classification tasks. Graph-based algorithms create heterogeneous graphs from text data: each node represents either a word or a document, and each edge indicates a relationship between two words or between a word and a document. We applied BERT and several graph-based models, namely TextGCN, GAT, BertGAT, and BertGCN, to five Bangla text datasets: SentNoB, Sarcasm detection, BanFakeNews, Hate speech detection, and Emotion detection. BERT surpassed the TextGCN and GAT models by a large margin in terms of accuracy, macro F1 score, and weighted F1 score. On the other hand, BertGCN and BertGAT outperformed both the standalone graph models and BERT. BertGAT excelled on the Emotion detection dataset and achieved a 1%-2% improvement over BERT on the Sarcasm detection, Hate speech detection, and BanFakeNews datasets. BertGCN, in turn, outperformed BertGAT by 1% on the SentNoB and BanFakeNews datasets and by 2% on the Sarcasm detection, Hate speech detection, and Emotion detection datasets. Furthermore, we examined different variations in graph structure and analyzed their effects.
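To make the graph description concrete, the following is a minimal sketch of how a heterogeneous word/document graph could be assembled from a small corpus. It is an illustration under stated assumptions, not the authors' pipeline: the function name `build_text_graph`, the whitespace tokenizer, the `window` parameter, and the use of raw counts as edge weights are all hypothetical choices made here. The abstract does not specify the edge weighting; TextGCN-style models commonly use TF-IDF for word-document edges and PMI for word-word edges, which this sketch replaces with simple counts for brevity.

```python
from collections import Counter
from itertools import combinations


def build_text_graph(docs, window=3):
    """Build a toy heterogeneous graph with word nodes and document nodes.

    Assumptions (not from the paper): whitespace tokenization, raw term
    counts on word-document edges, and sliding-window co-occurrence counts
    on word-word edges.
    """
    tokenized = [doc.split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})

    # Two node types: one node per document, one node per distinct word.
    doc_nodes = [("doc", i) for i in range(len(docs))]
    word_nodes = [("word", w) for w in vocab]

    edges = Counter()
    for i, toks in enumerate(tokenized):
        # Word-document edges, weighted by how often the word occurs.
        for word, count in Counter(toks).items():
            edges[(("doc", i), ("word", word))] = count

        # Word-word edges, weighted by the number of sliding windows
        # in which the two words appear together.
        for start in range(max(1, len(toks) - window + 1)):
            in_window = sorted(set(toks[start:start + window]))
            for a, b in combinations(in_window, 2):
                edges[(("word", a), ("word", b))] += 1

    return doc_nodes + word_nodes, dict(edges)


if __name__ == "__main__":
    # Placeholder tokens stand in for Bangla words.
    nodes, edges = build_text_graph(["a b c", "b c d"], window=2)
    for (src, dst), weight in sorted(edges.items()):
        print(src, dst, weight)
```

The resulting edge dictionary can be converted into an adjacency matrix and passed to a graph neural network layer; in the BERT-combined variants, document nodes would carry BERT embeddings as features rather than one-hot vectors.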
Cite
Dehan, F. N., Fahim, M., Ali, A. A., Amin, M. A., & Mahbubur Rahman, A. K. M. (2023). Investigating the Effectiveness of Graph-based Algorithm for Bangla Text Classification. In BLP 2023 - 1st Workshop on Bangla Language Processing, Proceedings of the Workshop (pp. 7–17). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.banglalp-1.12