XHATE-999: Analyzing and Detecting Abusive Language Across Domains and Languages

Abstract

We present XHATE-999, a multi-domain and multilingual evaluation data set for abusive language detection. By aligning test instances across six typologically diverse languages, XHATE-999 for the first time allows for disentanglement of the domain transfer and language transfer effects in abusive language detection. We conduct a series of domain- and language-transfer experiments with state-of-the-art monolingual and multilingual transformer models, setting strong baseline results and profiling XHATE-999 as a comprehensive evaluation resource for abusive language detection. Finally, we show that domain- and language-adaptation, via intermediate masked language modeling on abusive corpora in the target language, can lead to substantially improved abusive language detection in the target language in the zero-shot transfer setups.

Cite

Glavaš, G., Karan, M., & Vulić, I. (2020). XHATE-999: Analyzing and Detecting Abusive Language Across Domains and Languages. In COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference (pp. 6350–6365). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.coling-main.559
