Detecting Unintended Social Bias in Toxic Language Datasets

Citations: 12 · Mendeley readers: 34

Abstract

Warning: This paper contains material that may be offensive or upsetting; this cannot be avoided owing to the nature of the work. With the rise of online hate speech, the automatic detection of hate speech and offensive text as a natural language processing task is gaining popularity. However, very little research has been done on detecting unintended social bias in these toxic language datasets. This paper introduces ToxicBias, a new dataset curated from the existing dataset of the Kaggle competition "Jigsaw Unintended Bias in Toxicity Classification". We aim to detect social biases, their categories, and targeted groups. The dataset contains instances annotated for five bias categories, viz., gender, race/ethnicity, religion, political, and LGBTQ. We train transformer-based models on our curated dataset and report baseline performance for bias identification, target generation, and bias implications. Model biases and their mitigation are also discussed in detail. Our study motivates a systematic extraction of social bias data from toxic language datasets. All the code and data used for the experiments in this work are publicly available.
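To make the bias-identification baseline concrete, here is a minimal sketch of fine-tuning a pretrained transformer classifier over the five ToxicBias categories. This is not the authors' released code: the model name (bert-base-uncased), the multi-label framing, the toy examples, and the hyperparameters are all illustrative assumptions.

```python
# Minimal sketch (not the authors' code): fine-tune a transformer to
# tag comments with the five ToxicBias categories from the abstract.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

LABELS = ["gender", "race/ethnicity", "religion", "political", "lgbtq"]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=len(LABELS),
    # Assumption: a comment may carry more than one bias category,
    # so we frame this as multi-label classification (BCE loss).
    problem_type="multi_label_classification",
)

# Hypothetical training examples; the real ToxicBias instances are
# annotated with bias category and targeted group.
texts = ["example comment one", "example comment two"]
targets = torch.tensor([[1.0, 0, 0, 0, 0],
                        [0, 0, 1.0, 0, 0]])  # one-hot over LABELS

enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for _ in range(3):  # a few illustrative epochs
    optimizer.zero_grad()
    out = model(**enc, labels=targets)  # loss computed internally
    out.loss.backward()
    optimizer.step()
```

In practice one would train on the released ToxicBias splits and report per-category scores; those specifics are not reproduced here.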

Citation (APA)

Sahoo, N., Gupta, H., & Bhattacharyya, P. (2022). Detecting Unintended Social Bias in Toxic Language Datasets. In CoNLL 2022 - 26th Conference on Computational Natural Language Learning, Proceedings of the Conference (pp. 132–143). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.conll-1.10


Readers' Seniority

PhD / Post grad / Masters / Doc: 4 (44%)
Researcher: 3 (33%)
Lecturer / Post doc: 2 (22%)

Readers' Discipline

Computer Science: 10 (77%)
Philosophy: 1 (8%)
Neuroscience: 1 (8%)
Medicine and Dentistry: 1 (8%)
