Abstract
In this paper, we present a multimodal architecture for determining the emotion expressed in a meme. The architecture utilizes both the textual and the visual information present in a meme. To extract image features, we experimented with the pre-trained VGG-16 and Inception-V3 classifiers, and to extract text features, we used LSTM and BERT classifiers. For the LSTM classifier, we experimented with both FastText and GloVe embeddings. The best F1 scores our classifiers obtained in the official evaluation are 0.3309, 0.4752, and 0.2897 for Tasks A, B, and C, respectively, of the Memotion Analysis task (Task 8) organized as part of the International Workshop on Semantic Evaluation 2020 (SemEval-2020). In our study, we found that combining the textual and visual information expressed in a meme improves classifier performance over standalone classifiers that use only text or only visual data.
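The abstract does not specify how the two modalities are combined. As an illustration only, the following is a minimal Keras sketch of one common approach: concatenating frozen, ImageNet-pretrained VGG-16 image features with LSTM text features before a softmax classifier. All hyperparameters (vocabulary size, sequence length, embedding and layer dimensions, number of classes) are assumptions for illustration, not values from the paper.

    # Minimal sketch of a multimodal (image + text) meme classifier.
    # Hyperparameters below are illustrative assumptions.
    from tensorflow.keras import Input, Model
    from tensorflow.keras.applications import VGG16
    from tensorflow.keras.layers import (Dense, Embedding, LSTM,
                                         GlobalAveragePooling2D, Concatenate)

    VOCAB_SIZE, MAX_LEN, EMBED_DIM = 20000, 50, 300  # assumed values
    NUM_CLASSES = 3  # e.g. negative / neutral / positive (Task A)

    # Image branch: frozen ImageNet-pretrained VGG-16 as a feature extractor
    image_in = Input(shape=(224, 224, 3), name="meme_image")
    vgg = VGG16(weights="imagenet", include_top=False)
    vgg.trainable = False
    img_feat = GlobalAveragePooling2D()(vgg(image_in))

    # Text branch: embedding layer (FastText/GloVe weights could be loaded
    # here as initial weights) followed by an LSTM
    text_in = Input(shape=(MAX_LEN,), name="meme_text")
    emb = Embedding(VOCAB_SIZE, EMBED_DIM)(text_in)
    txt_feat = LSTM(128)(emb)

    # Fusion: concatenate the two feature vectors and classify
    fused = Concatenate()([img_feat, txt_feat])
    hidden = Dense(256, activation="relu")(fused)
    out = Dense(NUM_CLASSES, activation="softmax")(hidden)

    model = Model(inputs=[image_in, text_in], outputs=out)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

A standalone text-only or image-only baseline corresponds to dropping one branch and classifying on the remaining feature vector, which is the comparison the abstract reports.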
Cite
Baruah, A., Das, K. A., Barbhuiya, F. A., & Dey, K. (2020). IIITG-ADBU at SemEval-2020 Task 8: A Multimodal Approach to Detect Offensive, Sarcastic and Humorous Memes. In Proceedings of the Fourteenth Workshop on Semantic Evaluation (SemEval 2020), co-located with the 28th International Conference on Computational Linguistics (COLING 2020) (pp. 885–890). International Committee for Computational Linguistics. https://doi.org/10.18653/v1/2020.semeval-1.112