An Interpretable Approach to Hateful Meme Detection

Abstract

Hateful memes are an emerging method of spreading hate on the internet, relying on both images and text to convey a hateful message. We take an interpretable approach to hateful meme detection, using machine learning and simple heuristics to identify the features most important to classifying a meme as hateful. In the process, we build a gradient-boosted decision tree and an LSTM-based model that achieve performance (73.8 validation and 72.7 test auROC) comparable to the gold standard of humans and to state-of-the-art transformer models on this challenging task.

Cite

Deshpande, T., & Mani, N. (2021). An Interpretable Approach to Hateful Meme Detection. In ICMI 2021 - Proceedings of the 2021 International Conference on Multimodal Interaction (pp. 723–727). Association for Computing Machinery, Inc. https://doi.org/10.1145/3462244.3479949
