Ensemble-based Semi-Supervised Learning for Hate Speech Detection

Abstract

Large and accurately labeled textual corpora are vital to developing efficient hate speech classifiers. This paper introduces an ensemble-based semi-supervised learning approach to leverage the availability of abundant social media content. Starting with a reliable hate speech dataset, we train and test diverse classifiers that are then used to label a corpus of one million tweets. Next, we investigate several strategies to select the most confident labels from the obtained pseudo labels. We assess these strategies by re-training all the classifiers with the seed dataset augmented with the trusted pseudo-labeled data. Finally, we demonstrate that our approach improves classification performance over supervised hate speech classification methods.
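
The selection and augmentation steps described in the abstract could be sketched roughly as follows. This is a minimal illustration, not the authors' method: the abstract does not specify the selection strategies, so the unanimous-vote rule, the mean-confidence threshold of 0.9, and the names `select_pseudo_labels` and `augment` are all assumptions made for the example. Each classifier is assumed to output a probability that a tweet is hateful.

```python
from statistics import mean


def select_pseudo_labels(ensemble_probs, threshold=0.9, require_unanimous=True):
    """Pick trusted pseudo-labels from ensemble outputs (hypothetical strategy).

    ensemble_probs: one entry per unlabeled example; each entry is a list of
    P(hate) values, one per classifier in the ensemble.
    Returns a list of (example_index, label) pairs, label 1 = hate, 0 = not hate.
    """
    trusted = []
    for idx, probs in enumerate(ensemble_probs):
        # Each classifier votes for the class it finds more likely.
        votes = [1 if p >= 0.5 else 0 for p in probs]
        # Majority label; a tie falls back to the non-hate class (assumption).
        majority = 1 if sum(votes) * 2 > len(votes) else 0
        # Optionally keep only examples on which all classifiers agree.
        if require_unanimous and len(set(votes)) != 1:
            continue
        # Average confidence of the ensemble in the majority label.
        confidence = mean(p if majority == 1 else 1.0 - p for p in probs)
        if confidence >= threshold:
            trusted.append((idx, majority))
    return trusted


def augment(seed, unlabeled_texts, trusted):
    """Return the seed dataset extended with the trusted pseudo-labeled texts."""
    return seed + [(unlabeled_texts[i], label) for i, label in trusted]


# Toy example: three classifiers scoring four unlabeled tweets.
probs = [
    [0.95, 0.97, 0.92],  # confident, unanimous: hate
    [0.60, 0.40, 0.55],  # classifiers disagree
    [0.05, 0.02, 0.10],  # confident, unanimous: not hate
    [0.80, 0.85, 0.70],  # unanimous but below the threshold
]
print(select_pseudo_labels(probs))  # [(0, 1), (2, 0)]
```

In a full pipeline, the classifiers would then be retrained on the output of `augment` and compared against the versions trained on the seed dataset alone.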

Citation (APA)

Alsafari, S., & Sadaoui, S. (2021). Ensemble-based Semi-Supervised Learning for Hate Speech Detection. In Proceedings of the International Florida Artificial Intelligence Research Society Conference, FLAIRS (Vol. 34). Florida Online Journals, University of Florida. https://doi.org/10.32473/flairs.v34i1.128427
