Bias Mitigation for Toxicity Detection via Sequential Decisions

Abstract

Increased social media use has contributed to a greater prevalence of abusive, rude, and offensive textual comments. Machine learning models have been developed to detect toxic comments online, yet these models tend to show biases against users with marginalized or minority identities (e.g., females and African Americans). Established research on debiasing toxicity classifiers often (1) takes a static or batch approach, assuming that all information is available before making a one-time decision; and (2) uses a generic strategy to mitigate different biases (e.g., gender and racial biases), which assumes the biases are independent of one another. However, in real scenarios, the input typically arrives as a sequence of comments/words over time instead of all at once. Thus, decisions based on partial information must be made while additional input is arriving. Moreover, social bias is complex by nature. Each type of bias is defined within its unique context, which, consistent with intersectionality theory within the social sciences, might be correlated with the contexts of other forms of bias. In this work, we consider debiasing toxicity detection as a sequential decision-making process where different biases can be interdependent. In particular, we study debiasing toxicity detection with two aims: (1) to examine whether different biases tend to correlate with each other; and (2) to investigate how to jointly mitigate these correlated biases in an interactive manner to minimize the total amount of bias. At the core of our approach is a framework built upon the theory of sequential Markov Decision Processes that seeks to maximize prediction accuracy and minimize bias measures tailored to individual biases. Evaluations on two benchmark datasets empirically validate the hypothesis that biases tend to be correlated and corroborate the effectiveness of the proposed sequential debiasing strategy.
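As a rough illustration of the sequential framing described in the abstract, the sketch below treats the growing token prefix of a comment as the state of a decision process, lets a policy either wait for more input or commit to a prediction, and shapes the reward as an accuracy term minus a bias penalty. This is a minimal toy, not the authors' implementation: the lexicons, the identity-group field, the hand-written policy, and the lambda_bias weight are all illustrative assumptions.

```python
# Hypothetical sketch of debiased toxicity detection as a sequential decision
# process. Names and values here are illustrative assumptions only.

from dataclasses import dataclass
from typing import List, Optional

WAIT, PREDICT_TOXIC, PREDICT_CLEAN = "wait", "toxic", "clean"


@dataclass
class Comment:
    tokens: List[str]
    label: str              # gold label: "toxic" or "clean"
    group: Optional[str]    # identity group mentioned in the text, or None


# Toy lexicons standing in for a learned classifier (illustrative only).
TOXIC_CUES = {"idiot", "stupid", "trash"}
IDENTITY_TERMS = {"black", "muslim", "woman"}


def act(prefix: List[str], exhausted: bool) -> str:
    """Policy over partial input (the state): keep reading or commit.

    Deliberately naive and biased -- it also fires on identity terms --
    so the bias penalty in the reward is visible in the demo below.
    """
    if any(t in TOXIC_CUES or t in IDENTITY_TERMS for t in prefix):
        return PREDICT_TOXIC
    return PREDICT_CLEAN if exhausted else WAIT


def reward(pred: str, comment: Comment, lambda_bias: float = 0.5) -> float:
    """Accuracy term minus a bias penalty.

    The penalty fires when a clean comment that mentions an identity group
    is flagged as toxic -- a simplified stand-in for the group-specific
    bias measures the abstract refers to.
    """
    acc = 1.0 if pred == comment.label else 0.0
    biased = (pred == PREDICT_TOXIC and comment.label == "clean"
              and comment.group is not None)
    return acc - (lambda_bias if biased else 0.0)


def run_episode(comment: Comment) -> float:
    """Consume the comment token by token, acting on partial information."""
    prefix: List[str] = []
    for i, tok in enumerate(comment.tokens):
        prefix.append(tok)
        decision = act(prefix, exhausted=(i == len(comment.tokens) - 1))
        if decision != WAIT:
            return reward(decision, comment)
    return reward(PREDICT_CLEAN, comment)  # only reached for an empty comment


if __name__ == "__main__":
    corpus = [
        Comment("you are an idiot".split(), "toxic", None),
        Comment("as a black woman i disagree".split(), "clean", "black"),
    ]
    for c in corpus:
        print(" ".join(c.tokens), "->", run_episode(c))
```

In the paper's setting, the penalty term would instead be a bias measure tailored to each bias type (e.g., gender and racial bias); the point of the sequential formulation is that such correlated measures can be traded off jointly against accuracy as input arrives, rather than corrected once in a single batch pass.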

Citation (APA)

Cheng, L., Mosallanezhad, A., Silva, Y. N., Hall, D. L., & Liu, H. (2022). Bias Mitigation for Toxicity Detection via Sequential Decisions. In SIGIR 2022 - Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 1750–1760). Association for Computing Machinery, Inc. https://doi.org/10.1145/3477495.3531945
