UJNLP at SemEval-2020 Task 12: Detecting Offensive Language Using Bidirectional Transformers

Abstract

In this paper, we describe the pre-trained models we built to participate in SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media. On the shared task of offensive language identification in social media, pre-trained models such as Bidirectional Encoder Representations from Transformers (BERT) have achieved strong results. We preprocess the dataset according to the language habits of users on social networks. To address the class imbalance in OffensEval, we screen the newly provided machine-annotated samples to construct a new dataset, which we use to fine-tune the Robustly Optimized BERT Pretraining Approach (RoBERTa). For English subtask B, we add auxiliary sentences (AS) to transform the single-sentence classification task into a sentence-pair relationship recognition task. Our team, UJNLP, ranked 16th of 85 in English subtask A (offensive language identification).
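To make the auxiliary-sentence idea concrete, the following is a minimal sketch of how a tweet can be paired with a fixed auxiliary sentence so that RoBERTa sees a sentence-pair input rather than a single sentence. It assumes the Hugging Face transformers library; the preprocessing rules, the auxiliary sentence wording, and the label convention are illustrative assumptions, not the authors' published configuration.

```python
# Sketch of the auxiliary-sentence (AS) method for subtask B.
# Assumptions: Hugging Face transformers is available; the preprocessing
# rules, auxiliary sentence wording, and label convention below are
# hypothetical, not the configuration reported in the paper.
import re
import torch
from transformers import RobertaTokenizer, RobertaForSequenceClassification

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaForSequenceClassification.from_pretrained("roberta-base", num_labels=2)
model.eval()

def preprocess(tweet: str) -> str:
    """Normalize common social-media artifacts (illustrative rules only)."""
    tweet = re.sub(r"@\w+", "@USER", tweet)        # anonymize user mentions
    tweet = re.sub(r"https?://\S+", "URL", tweet)  # collapse links
    tweet = re.sub(r"#", "", tweet)                # strip hashtag marks
    return tweet.strip()

def classify_with_auxiliary_sentence(tweet: str) -> int:
    # Pairing the tweet with a fixed auxiliary sentence turns the
    # single-sentence task into sentence-pair (relationship) classification.
    auxiliary = "This offensive text is targeted."  # hypothetical wording
    inputs = tokenizer(preprocess(tweet), auxiliary,
                       return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return int(logits.argmax(dim=-1))  # 0 = untargeted, 1 = targeted (convention here)
```

In practice the model would first be fine-tuned on such tweet/auxiliary-sentence pairs; the snippet only illustrates how the pair input is constructed at inference time.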

Cite

APA

Yao, Y., Su, N., & Ma, K. (2020). UJNLP at SemEval-2020 Task 12: Detecting Offensive Language Using Bidirectional Transformers. In Proceedings of the Fourteenth Workshop on Semantic Evaluation (SemEval-2020), co-located with the 28th International Conference on Computational Linguistics (COLING 2020) (pp. 2203–2208). International Committee on Computational Linguistics. https://doi.org/10.18653/v1/2020.semeval-1.293
