Cross-Domain Detection of Abusive Language Online


Abstract

We investigate to what extent models trained to detect general abusive language generalize across datasets labeled with different abusive language types. To this end, we compare the cross-domain performance of simple classification models on nine different datasets, finding that the models fail to generalize to out-of-domain datasets and that having at least some in-domain data is important. We also show that using the frustratingly simple domain adaptation method of Daumé III (2007) in most cases improves the results over in-domain training, especially when used to augment a smaller dataset with a larger one.
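The domain adaptation method referenced above works by feature augmentation: each example's feature vector is copied into a shared "general" block and into a block specific to its source domain, with all other domain blocks left at zero, so a linear classifier can learn both shared and domain-specific weights. The sketch below illustrates this trick for a NumPy feature matrix; the function name `feda_augment` and the two-domain setup are illustrative, not taken from the paper.

```python
import numpy as np

def feda_augment(X, domains, n_domains=2):
    """Feature augmentation in the style of Daumé III (2007):
    map each row x to [x_shared, x_dom_0, ..., x_dom_{k-1}],
    where only the shared block and the row's own domain block
    are filled with x; the rest stay zero.

    X        : (n_samples, n_features) feature matrix
    domains  : iterable of domain indices, one per row
    """
    n, d = X.shape
    out = np.zeros((n, d * (n_domains + 1)))
    out[:, :d] = X                       # shared (general) copy
    for i, dom in enumerate(domains):
        start = d * (dom + 1)
        out[i, start:start + d] = X[i]   # domain-specific copy
    return out

# Example: two rows from two different domains.
X = np.array([[1.0, 2.0],
              [3.0, 4.0]])
X_aug = feda_augment(X, domains=[0, 1])
# Row 0 (domain 0): [1, 2, 1, 2, 0, 0]
# Row 1 (domain 1): [3, 4, 0, 0, 3, 4]
```

Any standard linear classifier can then be trained on `X_aug`; augmenting a smaller target dataset with a larger source one, as the abstract describes, amounts to stacking both augmented matrices before training.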

Cite

APA

Karan, M., & Šnajder, J. (2018). Cross-Domain Detection of Abusive Language Online. In 2nd Workshop on Abusive Language Online - Proceedings of the Workshop, co-located with EMNLP 2018 (pp. 132–137). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w18-5117
