UJNLP at SemEval-2020 Task 12: Detecting Offensive Language Using Bidirectional Transformers

Yinnan Yao; Nan Su; Kun Ma

Conference Proceedings

14th International Workshops on Semantic Evaluation, SemEval 2020 - co-located 28th International Conference on Computational Linguistics, COLING 2020, Proceedings (2020) 2203-2208

DOI: 10.18653/v1/2020.semeval-1.293

Abstract

In this paper, we built several pre-trained models to participate in SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media. On the common task of offensive language identification in social media, pre-trained models such as Bidirectional Encoder Representations from Transformers (BERT) have achieved good results. We preprocess the dataset according to the language habits of social network users. To address the data imbalance in OffensEval, we screened the newly provided machine-annotated samples to construct a new dataset, which we use to fine-tune the Robustly Optimized BERT Pretraining Approach (RoBERTa). For English subtask B, we add Auxiliary Sentences (AS) to transform the single-sentence classification task into a task of recognizing the relationship between two sentences. Our team, UJNLP, ranked 16th of 85 in English subtask A (offensive language identification).
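The auxiliary-sentence idea for subtask B can be illustrated with a minimal sketch. The abstract does not give the wording of the auxiliary sentences or the pairing scheme, so the sentences below, the helper names `build_pairs` and `predict_label`, and the one-pair-per-label expansion are all assumptions made for illustration; only the subtask B labels (TIN for targeted insult, UNT for untargeted) come from the task definition. Each pair would then be fed to a sentence-pair classifier such as a fine-tuned RoBERTa, which is not shown here.

```python
from typing import Dict, List, Tuple

# Hypothetical auxiliary sentences, one per subtask B label.
# The wording is an assumption; the abstract does not specify it.
AUXILIARY_SENTENCES: Dict[str, str] = {
    "TIN": "This post insults or threatens a specific target.",
    "UNT": "This post is offensive but has no specific target.",
}


def build_pairs(tweet: str, gold_label: str) -> List[Tuple[str, str, int]]:
    """Expand one labelled tweet into (tweet, auxiliary sentence, match) triples.

    match is 1 when the auxiliary sentence describes the gold label, else 0,
    so the classifier learns a relationship between the two sentences.
    """
    if gold_label not in AUXILIARY_SENTENCES:
        raise ValueError(f"unknown subtask B label: {gold_label!r}")
    return [
        (tweet, aux, int(label == gold_label))
        for label, aux in AUXILIARY_SENTENCES.items()
    ]


def predict_label(match_scores: Dict[str, float]) -> str:
    """Pick the label whose auxiliary sentence received the highest match score."""
    return max(match_scores, key=match_scores.get)


if __name__ == "__main__":
    for pair in build_pairs("you are an idiot", "TIN"):
        print(pair)
    # Scores here are made up; in practice they come from the pair classifier.
    print(predict_label({"TIN": 0.2, "UNT": 0.7}))
```

At inference time, under this assumed scheme, a tweet is paired with every auxiliary sentence and the label with the highest match score is returned.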

Cite

Yao, Y., Su, N., & Ma, K. (2020). UJNLP at SemEval-2020 Task 12: Detecting Offensive Language Using Bidirectional Transformers. In 14th International Workshops on Semantic Evaluation, SemEval 2020 - co-located 28th International Conference on Computational Linguistics, COLING 2020, Proceedings (pp. 2203–2208). International Committee for Computational Linguistics. https://doi.org/10.18653/v1/2020.semeval-1.293
