Fine-Grained Fairness Analysis of Abusive Language Detection Systems with CheckList

Citations: 12
Mendeley readers: 52

Abstract

Current abusive language detection systems have demonstrated unintended bias towards sensitive features such as nationality or gender. This is a crucial issue, which may harm minorities and underrepresented groups if such systems are integrated into real-world applications. In this paper, we create ad hoc tests through the CheckList tool (Ribeiro et al., 2020) to detect biases within abusive language classifiers for English. We compare the behaviour of two BERT-based models, one trained on a generic abusive language dataset and the other on a dataset for misogyny detection. Our evaluation shows that, although BERT-based classifiers achieve high accuracy levels on a variety of natural language processing tasks, they perform very poorly with regard to fairness and bias, in particular on samples involving implicit stereotypes, expressions of hate towards minorities, and protected attributes such as race or sexual orientation. We release both the notebooks implemented to extend the Fairness tests and the synthetic datasets, which can be used to evaluate system bias independently of CheckList.
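To give a concrete flavour of the approach, the sketch below shows how a CheckList Minimum Functionality Test (MFT) for the Fairness capability can be assembled with the public CheckList library (Editor, MFT, PredictorWrapper are real CheckList classes). The identity list, the template sentence, the label convention (0 = non-abusive), and the toy predictor are illustrative assumptions, not the authors' released notebooks; in practice the wrapped predictor would be one of the BERT-based classifiers under evaluation.

```python
# Minimal sketch of a CheckList Fairness MFT for an abusive-language classifier.
# Assumptions: label 0 = non-abusive, label 1 = abusive; the identity terms and
# template are illustrative; predict_proba is a toy stand-in for a real model.
import numpy as np
from checklist.editor import Editor
from checklist.test_types import MFT
from checklist.pred_wrapper import PredictorWrapper

editor = Editor()

# Neutral statements mentioning protected groups: a fair classifier should
# label every generated sentence as non-abusive (label 0).
identities = ['women', 'gay people', 'Muslims', 'immigrants', 'Black people']
templ = editor.template('I am friends with many {identity}.',
                        identity=identities, labels=0, save=True)

test = MFT(templ.data, labels=templ.labels,
           name='Neutral identity mentions',
           capability='Fairness',
           description='Non-abusive sentences mentioning protected groups.')

# Toy predictor returning softmax-like scores over [non-abusive, abusive];
# replace with the real BERT model's probability function.
def predict_proba(texts):
    return np.array([[0.9, 0.1]] * len(texts))

wrapped = PredictorWrapper.wrap_softmax(predict_proba)
test.run(wrapped)
test.summary()  # reports the failure rate and prints example failing cases
```

The failure rate reported by test.summary() is the quantity such fine-grained tests compare across identity terms: a classifier that flags neutral mentions of some groups as abusive, but not others, exhibits exactly the kind of unintended bias the paper measures.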

Citation (APA)
Manerba, M. M., & Tonelli, S. (2021). Fine-Grained Fairness Analysis of Abusive Language Detection Systems with CheckList. In WOAH 2021 - 5th Workshop on Online Abuse and Harms, Proceedings of the Workshop (pp. 81–91). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.woah-1.9
