Multimodal emotion classification

Abstract

Most NLP and Computer Vision tasks are limited by the scarcity of labelled data. In social media emotion classification and other related tasks, hashtags have been used as indicators to label data. With the rapid increase in emoji usage on social media, emojis are used as an additional feature for major social NLP tasks. However, this is less explored in the case of multimedia posts on social media, where posts are composed of both image and text. At the same time, we have seen a surge in interest in incorporating domain knowledge to improve machine understanding of text. In this paper, we investigate whether domain knowledge for emoji can improve the accuracy of the emotion classification task. We exploit the importance of different modalities from social media posts for the emotion classification task using state-of-the-art deep learning architectures. Our experiments demonstrate that the three modalities (text, emoji, and images) encode different information to express emotion and therefore can complement each other. Our results also demonstrate that emoji sense depends on the textual context, and that emoji combined with text encodes better information than either considered separately. The highest accuracy of 71.98% is achieved with a training data of 550k posts.
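The complementarity of the three modalities described above is often exploited through late fusion: each modality is encoded into a fixed-size embedding, and the embeddings are concatenated before a classification layer. The following is a minimal sketch of that idea; the embedding dimensions, encoders, and the six-class emotion label set are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical fixed-size embeddings for each modality (dims are illustrative):
text_emb = rng.normal(size=256)   # e.g. from a text encoder (LSTM/BERT-style)
emoji_emb = rng.normal(size=64)   # e.g. from pretrained emoji embeddings
image_emb = rng.normal(size=512)  # e.g. from a CNN image encoder

# Late fusion: concatenate the modality embeddings into one feature vector.
fused = np.concatenate([text_emb, emoji_emb, image_emb])

# A linear classifier over an assumed set of 6 emotion classes,
# followed by a softmax to obtain class probabilities.
n_classes = 6
W = rng.normal(size=(n_classes, fused.size))
b = np.zeros(n_classes)

logits = W @ fused + b
probs = np.exp(logits - logits.max())
probs /= probs.sum()
pred = int(np.argmax(probs))
```

In practice the fused vector would feed a trained network rather than random weights; the sketch only shows how the text, emoji, and image signals can be combined into a single representation so the classifier can use all three.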

Citation (APA)

Illendula, A., & Sheth, A. (2019). Multimodal emotion classification. In The Web Conference 2019 - Companion of the World Wide Web Conference, WWW 2019 (pp. 439–449). Association for Computing Machinery, Inc. https://doi.org/10.1145/3308560.3316549
