Fine-Grained Fairness Analysis of Abusive Language Detection Systems with CheckList

Citations: 12
Mendeley readers: 52

Abstract

Current abusive language detection systems have demonstrated unintended bias towards sensitive features such as nationality or gender. This is a crucial issue, which may harm minorities and underrepresented groups if such systems are integrated into real-world applications. In this paper, we create ad hoc tests through the CheckList tool (Ribeiro et al., 2020) to detect biases within abusive language classifiers for English. We compare the behaviour of two BERT-based models, one trained on a generic abusive language dataset and the other on a dataset for misogyny detection. Our evaluation shows that, although BERT-based classifiers achieve high accuracy levels on a variety of natural language processing tasks, they perform very poorly with regard to fairness and bias, in particular on samples involving implicit stereotypes, expressions of hate towards minorities, and protected attributes such as race or sexual orientation. We release both the notebooks implemented to extend the Fairness tests and the synthetic datasets, which can be used to evaluate system bias independently of CheckList.
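To give a concrete flavour of the approach, the sketch below shows how a CheckList Minimum Functionality Test (MFT) for the Fairness capability can be assembled with the public CheckList library (Editor, MFT, PredictorWrapper are real CheckList classes). The identity list, the template sentence, the label convention (0 = non-abusive), and the toy predictor are illustrative assumptions, not the authors' released notebooks; in practice the wrapped predictor would be one of the BERT-based classifiers under evaluation.

```python
# Minimal sketch of a CheckList Fairness MFT for an abusive-language classifier.
# Assumptions: label 0 = non-abusive, label 1 = abusive; the identity terms and
# template are illustrative; predict_proba is a toy stand-in for a real model.
import numpy as np
from checklist.editor import Editor
from checklist.test_types import MFT
from checklist.pred_wrapper import PredictorWrapper

editor = Editor()

# Neutral statements mentioning protected groups: a fair classifier should
# label every generated sentence as non-abusive (label 0).
identities = ['women', 'gay people', 'Muslims', 'immigrants', 'Black people']
templ = editor.template('I am friends with many {identity}.',
                        identity=identities, labels=0, save=True)

test = MFT(templ.data, labels=templ.labels,
           name='Neutral identity mentions',
           capability='Fairness',
           description='Non-abusive sentences mentioning protected groups.')

# Toy predictor returning softmax-like scores over [non-abusive, abusive];
# replace with the real BERT model's probability function.
def predict_proba(texts):
    return np.array([[0.9, 0.1]] * len(texts))

wrapped = PredictorWrapper.wrap_softmax(predict_proba)
test.run(wrapped)
test.summary()  # reports the failure rate and prints example failing cases
```

The failure rate reported by test.summary() is the quantity such fine-grained tests compare across identity terms: a classifier that flags neutral mentions of some groups as abusive, but not others, exhibits exactly the kind of unintended bias the paper measures.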

Citation (APA)
Manerba, M. M., & Tonelli, S. (2021). Fine-Grained Fairness Analysis of Abusive Language Detection Systems with CheckList. In WOAH 2021 - 5th Workshop on Online Abuse and Harms, Proceedings of the Workshop (pp. 81–91). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.woah-1.9
