We introduce XED, a multilingual fine-grained emotion dataset. The dataset consists of human-annotated Finnish (25k) and English (30k) sentences, as well as projected annotations for 30 additional languages, providing new resources for many low-resource languages. We annotate the dataset with Plutchik’s eight core emotions plus a neutral label, creating a multi-label, multi-class dataset. We evaluate the dataset using language-specific BERT models and SVMs and show that models trained on XED perform on par with those trained on similar datasets, making XED a useful resource for sentiment analysis and emotion detection.
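To make the multi-label setup concrete, the following minimal sketch shows one way a sentence's set of emotions could be encoded as a multi-hot vector over Plutchik's eight core emotions plus neutral. The label order, the `to_multi_hot` helper, and the example sentences are illustrative assumptions; they are not taken from the XED release or its file format.

```python
# Plutchik's eight core emotions plus neutral.
# The ordering here is an assumption for illustration, not XED's own label indexing.
LABELS = [
    "anger", "anticipation", "disgust", "fear",
    "joy", "sadness", "surprise", "trust", "neutral",
]


def to_multi_hot(emotions):
    """Encode a set of emotion names as a 0/1 vector aligned with LABELS."""
    unknown = set(emotions) - set(LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if label in emotions else 0 for label in LABELS]


# Hypothetical sentences; a sentence may carry more than one emotion.
examples = [
    ("I can't believe you came!", {"joy", "surprise"}),
    ("The meeting is at noon.", {"neutral"}),
]

for text, emotions in examples:
    print(text, to_multi_hot(emotions))
```

A representation like this allows one binary decision per label, which is a common way to frame multi-label classification for both SVMs and fine-tuned BERT models.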
CITATION STYLE
Öhman, E., Pàmies, M., Kajava, K., & Tiedemann, J. (2020). XED: A Multilingual Dataset for Sentiment Analysis and Emotion Detection. In COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference (pp. 6542–6552). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.coling-main.575