Detection of Toxic Language in Short Text Messages

Abstract

The ever-expanding online communication landscape lets people with significantly different views cross paths in ways that were never possible before. This leads to a rise in toxicity in online comments and discussions and makes it critically important to develop means of detecting such language. The toxic language detection problem is fairly well researched, and some solutions produce highly accurate predictions when sufficiently large datasets are available for training. However, such datasets are not always available for every language. In this paper, we review different ways to approach the problem by transferring knowledge from one language to another: machine translation, multi-lingual models, and domain adaptation. We also analyze methods for word embedding, such as Word2Vec, FastText, GloVe, and BERT, and methods for classifying toxic comments: Naïve Bayes, Random Forest, Logistic Regression, Support Vector Machine, Majority Vote, and Recurrent Neural Networks. We demonstrate that for small datasets in the Russian language, traditional machine-learning techniques produce highly competitive results on par with deep learning methods, and that machine translation of the dataset into English produces more accurate results than multi-lingual models.
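
To make one of the traditional classifiers named in the abstract concrete, here is a minimal sketch of a multinomial Naïve Bayes text classifier using only the Python standard library. It is not the authors' implementation: the whitespace tokenizer, the add-one smoothing, the class labels, and the tiny English training set are all assumptions invented for illustration, and the paper's own features (word embeddings) are replaced here by plain bag-of-words counts.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real preprocessing)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes over bag-of-words counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)          # documents per class
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        self.total_docs = len(labels)
        return self

    def log_scores(self, text):
        """Return the unnormalized log-posterior of each class for one text."""
        scores = {}
        for c in self.classes:
            score = math.log(self.doc_counts[c] / self.total_docs)  # log prior
            total = sum(self.word_counts[c].values())
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # skip words never seen during training
                smoothed = (self.word_counts[c][word] + 1) / (total + len(self.vocab))
                score += math.log(smoothed)
            scores[c] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical toy data, invented for illustration only (not from the paper).
TRAIN_TEXTS = [
    "you are an idiot",
    "shut up you stupid fool",
    "what a stupid idiot",
    "thank you for the help",
    "have a nice day",
    "great answer thank you",
]
TRAIN_LABELS = ["toxic", "toxic", "toxic", "ok", "ok", "ok"]

model = NaiveBayes().fit(TRAIN_TEXTS, TRAIN_LABELS)
```

Calling `model.predict(text)` on a new message returns whichever label has the higher log score. A classifier this simple needs very little training data, which may be one reason such methods remain competitive on small datasets, as the abstract reports.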

Cite

Makhnytkina, O., Matveev, A., Bogoradnikova, D., Lizunova, I., Maltseva, A., & Shilkina, N. (2020). Detection of Toxic Language in Short Text Messages. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12335 LNAI, pp. 315–325). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-60276-5_31
