Abstract
Reducing and counteracting hate speech on social media is a significant concern. Most proposed automatic methods target English exclusively, and very few consistently labeled non-English resources have been proposed. Learning to detect hate speech in English and transferring to unseen languages seems like an immediate solution. This work is the first to shed light on the limits of this zero-shot, cross-lingual transfer learning framework for hate speech detection. We use benchmark data sets in English, Italian, and Spanish to detect hate speech towards immigrants and women. Investigating post-hoc explanations of the model, we discover that non-hateful, language-specific taboo interjections are misinterpreted as signals of hate speech. Our findings demonstrate that zero-shot, cross-lingual models cannot be used as they are, but need to be carefully designed.
Cite
Nozza, D. (2021). Exposing the limits of Zero-shot Cross-lingual Hate Speech Detection. In ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference (Vol. 2, pp. 907–914). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.acl-short.114