LIIR at SemEval-2020 Task 12: A Cross-Lingual Augmentation Approach for Multilingual Offensive Language Identification

Abstract

This paper presents our system, LIIR, for SemEval-2020 Task 12 on Multilingual Offensive Language Identification in Social Media (OffensEval 2). We participated in Subtask A for the English, Danish, Greek, Arabic, and Turkish languages. We adapt and fine-tune the BERT and multilingual BERT models released by Google AI for the English and non-English languages, respectively. For English, we use a combination of two fine-tuned BERT models. For the other languages, we propose a cross-lingual augmentation approach to enrich the training data, and we use multilingual BERT to obtain sentence representations. LIIR achieved rank 14/38, 18/47, 24/86, 24/54, and 25/40 for Greek, Turkish, English, Arabic, and Danish, respectively.
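
The sketch below is not the authors' code; it is a minimal illustration of the non-English pipeline described above, assuming the Hugging Face "transformers" library and the publicly available "bert-base-multilingual-cased" checkpoint. The mean-pooling choice for sentence representations and the "translate" helper (standing in for the cross-lingual augmentation step, whose translation tooling is not specified in the abstract) are assumptions for illustration only.

import torch
from transformers import AutoTokenizer, AutoModel

# Multilingual BERT, used here to embed sentences in any of the target languages.
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")
model.eval()

def sentence_representation(texts):
    """Mean-pool the last hidden states into one fixed-size vector per sentence."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state            # (batch, seq_len, dim)
    mask = batch["attention_mask"].unsqueeze(-1).float()      # (batch, seq_len, 1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)       # (batch, dim)

def translate(text, target_lang):
    """Hypothetical placeholder for the cross-lingual augmentation step:
    translate a training example into target_lang to enrich that language's
    training set. The actual translation tooling is not given in the abstract."""
    raise NotImplementedError

# Usage: embed a small batch before training a downstream offensive/not-offensive classifier.
vecs = sentence_representation(["this is offensive", "have a nice day"])
print(vecs.shape)  # torch.Size([2, 768])

In this reading, augmentation enlarges each non-English training set with translated examples, and the resulting sentences are all embedded with multilingual BERT before classification; the exact classifier and combination scheme are described in the full paper.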

Citation (APA)

Ghadery, E., & Moens, M. F. (2020). LIIR at SemEval-2020 Task 12: A cross-lingual augmentation approach for multilingual offensive language identification. In Proceedings of the Fourteenth Workshop on Semantic Evaluation (SemEval-2020), co-located with the 28th International Conference on Computational Linguistics (COLING 2020) (pp. 2073–2079). International Committee for Computational Linguistics. https://doi.org/10.18653/v1/2020.semeval-1.274
