TamilEmo: Fine-grained Emotion Detection Dataset for Tamil

Charangan Vasantharajan; Ruba Priyadharshini; Prasanna Kumar Kumarasen; Rahul Ponnusamy; Sathiyaraj Thangasamy; Sean Benhur; Thenmozhi Durairaj; Kanchana Sivanraju; Anbukkarasi Sampath; Bharathi Raja Chakravarthi

Conference Proceedings

Communications in Computer and Information Science (2023) 1802 CCIS 35-50

DOI: 10.1007/978-3-031-33231-9_3

Abstract

Emotion analysis of text is regarded as both a challenging and an interesting task in Natural Language Processing. However, the lack of datasets for low-resource languages such as Tamil makes it difficult to conduct high-standard research in this area. We therefore introduce a large, manually annotated dataset of more than 42k Tamil YouTube comments, labeled with 31 emotions for emotion recognition. The goal of this dataset is to improve emotion detection in multiple downstream tasks in Tamil. We also created three groupings of our emotions (3-class, 7-class, and 31-class) and evaluated the models' performance on each grouping. We ran several baseline models, and our MuRIL model achieved the highest macro F1 score, 0.67, on the 3-class grouping. On the 7-class and 31-class groupings, the MuRIL and Random Forest models performed well, with macro F1 scores of 0.52 and 0.29 respectively.
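The abstract reports results as macro F1, which averages the per-class F1 scores with equal weight regardless of class frequency. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code; the function name `macro_f1` and the placeholder labels `"a"`, `"b"`, `"c"` are assumptions made for illustration and do not correspond to the dataset's emotion classes.

```python
from collections import defaultdict


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = defaultdict(int)  # true positives per label
    fp = defaultdict(int)  # false positives per label
    fn = defaultdict(int)  # false negatives per label

    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)

    return sum(scores) / len(scores) if scores else 0.0


# Toy example with placeholder labels (hypothetical, not from the dataset).
gold = ["a", "a", "b", "c"]
predicted = ["a", "b", "b", "c"]
score = macro_f1(gold, predicted)
```

Because every class contributes equally, rare classes pull the score down as much as frequent ones, which is one plausible reason the reported score falls as the number of classes grows from 3 to 31.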

Vasantharajan, C., Priyadharshini, R., Kumarasen, P. K., Ponnusamy, R., Thangasamy, S., Benhur, S., … Chakravarthi, B. R. (2023). TamilEmo: Fine-grained Emotion Detection Dataset for Tamil. In Communications in Computer and Information Science (Vol. 1802 CCIS, pp. 35–50). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-33231-9_3
