This paper presents the application of two strong baseline systems for toxicity detection and evaluates their performance in identifying and categorizing offensive language in social media. Perspective is an API that serves multiple machine learning models for improving online conversations, including a toxicity detection system trained on a wide variety of comments from platforms across the Internet. BERT is a recently popular language representation model that is fine-tuned per task and achieves state-of-the-art performance on multiple NLP tasks. Perspective performed better than BERT in detecting toxicity, whereas BERT was much better at categorizing the offensive type. Both baselines ranked surprisingly high in the SemEval-2019 OffensEval competition: Perspective in detecting an offensive post (12th) and BERT in categorizing it (11th). The main contribution of this paper is the assessment of two strong baselines for the identification (Perspective) and the categorization (BERT) of offensive language with little or no additional training data.
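As an illustration of how a Perspective baseline is typically queried, the following minimal Python sketch (not taken from the paper) requests a TOXICITY score for a single comment via the publicly documented comments:analyze endpoint; the API key and example text are placeholders.

    import requests

    # Minimal sketch of a Perspective API toxicity query (illustrative only).
    API_KEY = "YOUR_API_KEY"  # placeholder, not from the paper
    url = (
        "https://commentanalyzer.googleapis.com/v1alpha1/"
        f"comments:analyze?key={API_KEY}"
    )
    payload = {
        "comment": {"text": "You are a wonderful person."},  # placeholder comment
        "requestedAttributes": {"TOXICITY": {}},
    }
    response = requests.post(url, json=payload)
    # The summary score is a probability-like value in [0, 1].
    score = response.json()["attributeScores"]["TOXICITY"]["summaryScore"]["value"]
    print(f"Toxicity score: {score:.3f}")

In the paper's setting, such a score would be thresholded to decide whether a post is offensive, while the BERT baseline is fine-tuned separately for the categorization sub-tasks.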
CITATION STYLE
Pavlopoulos, J., Androutsopoulos, I., Thain, N., & Dixon, L. (2019). ConvAI at SemEval-2019 task 6: Offensive language identification and categorization with perspective and BERT. In NAACL HLT 2019 - International Workshop on Semantic Evaluation, SemEval 2019, Proceedings of the 13th Workshop (pp. 571–576). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/s19-2102