Not all swear words are used equal: Attention over word n-grams for abusive language identification

Horacio Jesús Jarquín-Vásquez; Manuel Montes-y-Gómez; Luis Villaseñor-Pineda

Conference Proceedings, Open Access

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2020) 12088 LNCS 282-292

DOI: 10.1007/978-3-030-49076-8_27

Abstract

The increasing propagation of abusive language in social media is a major concern for supplier companies and governments because of its negative social impact. A large number of methods have been developed for its automatic identification, ranging from dictionary-based methods to sophisticated deep learning approaches. A common problem in all these methods is distinguishing the offensive use of swear words from their everyday and humorous usage. To tackle this particular issue, we propose an attention-based neural network architecture that captures the importance of word n-grams according to their context. The results obtained on four standard collections from Twitter and Facebook are encouraging: they outperform the $F_1$ scores of state-of-the-art methods and allow identifying a set of inherently offensive swear words, as well as others whose interpretation depends on context.
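As a rough illustration of the idea of weighting word n-grams by attention, the following minimal sketch extracts word bigrams from a sentence and pools their vectors with softmax attention weights. It is not the authors' architecture: the helper names `word_ngrams` and `attention_pool`, the dot-product scoring, the vector size, and the random vectors standing in for learned representations are all assumptions made for this example.

```python
import numpy as np


def word_ngrams(tokens, n):
    """Return all contiguous word n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def attention_pool(ngram_vectors, context_vector):
    """Pool n-gram vectors with softmax attention weights.

    Scores are plain dot products with a context vector (an assumption
    for this sketch; a trained model would learn its own scoring).
    """
    scores = ngram_vectors @ context_vector
    scores = scores - scores.max()  # shift for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    pooled = weights @ ngram_vectors  # weighted sum of the n-gram vectors
    return weights, pooled


rng = np.random.default_rng(0)
tokens = "you are a damn good friend".split()
bigrams = word_ngrams(tokens, 2)

# Hypothetical stand-ins: random vectors instead of learned representations.
ngram_vectors = rng.normal(size=(len(bigrams), 8))
context_vector = rng.normal(size=8)

weights, pooled = attention_pool(ngram_vectors, context_vector)
for gram, weight in zip(bigrams, weights):
    print(" ".join(gram), round(float(weight), 3))
```

In a trained model the weights would be learned, so that the same swear word could receive a high weight in an offensive context and a low one in a friendly or humorous context; with random vectors the printed weights carry no meaning.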

Cite

Jarquín-Vásquez, H. J., Montes-y-Gómez, M., & Villaseñor-Pineda, L. (2020). Not all swear words are used equal: Attention over word n-grams for abusive language identification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12088 LNCS, pp. 282–292). Springer. https://doi.org/10.1007/978-3-030-49076-8_27
