XED: A Multilingual Dataset for Sentiment Analysis and Emotion Detection

Abstract

We introduce XED, a multilingual fine-grained emotion dataset. The dataset consists of human-annotated Finnish (25k) and English (30k) sentences, as well as projected annotations for 30 additional languages, providing new resources for many low-resource languages. We annotate the dataset with Plutchik’s core emotions plus a neutral label, creating a multilabel multiclass dataset. The dataset is carefully evaluated using language-specific BERT models and SVMs to show that XED performs on par with other similar datasets and is therefore a useful tool for sentiment analysis and emotion detection.
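
As a rough illustration of what "multilabel multiclass" means here, the sketch below encodes a sentence's annotations as a multi-hot vector over Plutchik's eight core emotions plus neutral. The label ordering and the helper names `encode` and `decode` are assumptions made for this example; they are not taken from the XED release, whose actual file format and label indexing may differ.

```python
# Plutchik's eight core emotions plus the added neutral class.
# NOTE: this ordering is an illustrative assumption, not XED's own label indexing.
LABELS = ["anger", "anticipation", "disgust", "fear", "joy",
          "sadness", "surprise", "trust", "neutral"]
INDEX = {name: i for i, name in enumerate(LABELS)}


def encode(emotions):
    """Return a multi-hot vector with a 1 for every emotion assigned to a sentence."""
    vector = [0] * len(LABELS)
    for name in emotions:
        vector[INDEX[name]] = 1
    return vector


def decode(vector):
    """Map a multi-hot vector back to the list of emotion names it marks."""
    return [LABELS[i] for i, flag in enumerate(vector) if flag]


# A single sentence may carry more than one emotion label (multilabel).
example = encode(["joy", "trust"])
print(example)          # [0, 0, 0, 0, 1, 0, 0, 1, 0]
print(decode(example))  # ['joy', 'trust']
```

A vector of this shape is the usual target for a classifier with one independent output per class, which is one common way to train on multilabel data.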

Cite

Öhman, E., Pàmies, M., Kajava, K., & Tiedemann, J. (2020). XED: A Multilingual Dataset for Sentiment Analysis and Emotion Detection. In COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference (pp. 6542–6552). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.coling-main.575
