An Interpretable Approach to Hateful Meme Detection

Tanvi Deshpande; Nitya Mani

Conference Proceedings (Open Access)

ICMI 2021 - Proceedings of the 2021 International Conference on Multimodal Interaction (2021) 723-727

DOI: 10.1145/3462244.3479949

Abstract

Hateful memes are an emerging method of spreading hate on the internet, relying on both images and text to convey a hateful message. We take an interpretable approach to hateful meme detection, using machine learning and simple heuristics to identify the features most important for classifying a meme as hateful. In the process, we build a gradient-boosted decision tree and an LSTM-based model that achieve performance (73.8 validation and 72.7 test auROC) comparable to the gold standard of humans and state-of-the-art transformer models on this challenging task.

Citation (APA)

Deshpande, T., & Mani, N. (2021). An Interpretable Approach to Hateful Meme Detection. In ICMI 2021 - Proceedings of the 2021 International Conference on Multimodal Interaction (pp. 723–727). Association for Computing Machinery, Inc. https://doi.org/10.1145/3462244.3479949
