Socially Responsible Hate Speech Detection: Can Classifiers Reflect Social Stereotypes?


Abstract

Recent studies have shown that hate speech technologies may propagate social stereotypes against marginalized groups. Nevertheless, there has been a lack of realistic approaches for assessing and mitigating biased technologies. In this paper, we introduce a new approach for analyzing the potential of hate speech classifiers to reflect social stereotypes, through the investigation of stereotypical beliefs contrasted with counter-stereotypes. We empirically measure the distribution of stereotypical beliefs by analyzing the differential classification of tuples containing stereotypes versus counter-stereotypes across machine learning models and datasets. Experimental results show that hate speech classifiers attribute unreal or negligent offensiveness to social identity groups by reflecting and reinforcing stereotypical beliefs about minorities. We also found that models which embed expert and context information from offensiveness markers show promising results for mitigating social stereotype bias, towards socially responsible hate speech detection.
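
To make the probing idea concrete, the sketch below scores paired stereotype/counter-stereotype sentences with an off-the-shelf offensive-language classifier and compares the predictions. This is a minimal illustration of the general contrastive setup, not the authors' actual pipeline: the model name and the example pairs are assumptions chosen for demonstration.

```python
# Hypothetical sketch: probe a hate/offensive-speech classifier with
# stereotype vs. counter-stereotype pairs and compare its predictions.
from transformers import pipeline

# Any offensive-language classifier could be substituted here; this
# public Hugging Face model is used purely for illustration.
classifier = pipeline(
    "text-classification",
    model="cardiffnlp/twitter-roberta-base-offensive",
)

# Each tuple pairs a stereotypical belief with its counter-stereotype.
# These examples are invented for the sketch, not taken from the paper.
pairs = [
    ("Women are bad drivers.", "Women are good drivers."),
    ("Immigrants are criminals.", "Immigrants are law-abiding."),
]

for stereotype, counter in pairs:
    s_pred = classifier(stereotype)[0]
    c_pred = classifier(counter)[0]
    # A classifier free of stereotype bias should not change its decision
    # merely because a social identity group is mentioned; divergent
    # labels across a pair signal the kind of bias the paper measures.
    print(f"{stereotype!r}: {s_pred['label']} ({s_pred['score']:.2f})")
    print(f"{counter!r}: {c_pred['label']} ({c_pred['score']:.2f})")
```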

Citation (APA)
Vargas, F., Carvalho, I., Hürriyetoǧlu, A., Pardo, T. A. S., & Benevenuto, F. (2023). Socially Responsible Hate Speech Detection: Can Classifiers Reflect Social Stereotypes? In International Conference Recent Advances in Natural Language Processing, RANLP (pp. 1187–1196). Incoma Ltd. https://doi.org/10.26615/978-954-452-092-2_126
