Not all swear words are used equal: Attention over word n-grams for abusive language identification

Abstract

The increasing propagation of abusive language in social media is a major concern for supplier companies and governments because of its negative social impact. A large number of methods have been developed for its automatic identification, ranging from dictionary-based methods to sophisticated deep learning approaches. A common problem in all these methods is distinguishing the offensive use of swear words from their everyday and humorous usage. To tackle this particular issue, we propose an attention-based neural network architecture that captures the importance of word n-grams according to their context. The results obtained on four standard collections from Twitter and Facebook are encouraging: they outperform the $F_1$ scores of state-of-the-art methods and allow identifying a set of inherently offensive swear words, as well as others whose interpretation depends on context.

Cite

Jarquín-Vásquez, H. J., Montes-y-Gómez, M., & Villaseñor-Pineda, L. (2020). Not all swear words are used equal: Attention over word n-grams for abusive language identification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12088 LNCS, pp. 282–292). Springer. https://doi.org/10.1007/978-3-030-49076-8_27
