Bias and comparison framework for abusive language datasets

  • Wich M
  • Eder T
  • Al Kuwatly H
  • Groh G

This article is free to access.

Abstract

As research on the automatic detection of abusive language and hate speech has increased, numerous datasets have been produced. A problem with this diversity is that the datasets often differ in, among other things, context, platform, sampling process, collection strategy, and labeling schema. Existing surveys compare these datasets only superficially. Therefore, we developed a bias and comparison framework for the in-depth analysis of abusive language datasets and provide a comparison of five English and six Arabic datasets. We make this framework available to researchers and data scientists who work with such datasets so that they can be aware of the datasets' properties and consider them in their work.

Cite

Wich, M., Eder, T., Al Kuwatly, H., & Groh, G. (2022). Bias and comparison framework for abusive language datasets. AI and Ethics, 2(1), 79–101. https://doi.org/10.1007/s43681-021-00081-0
