UMUTeam and SINAI at SemEval-2023 Task 9: Multilingual Tweet Intimacy Analysis using Multilingual Large Language Models and Data Augmentation

0Citations
Citations of this article
12Readers
Mendeley users who have this article in their library.

Abstract

This work presents the participation of the UMUTeam and the SINAI research groups in the SemEval-2023 Task 9: Multilingual Tweet Intimacy Analysis. The goal of this task is to predict the intimacy of a set of tweets in 10 languages: English, Spanish, Italian, Portuguese, French, Chinese, Hindi, Arabic, Dutch and Korean, of which, the last 4 are not in the training data. Our approach to address this task is based on data augmentation and the use of three multilingual Large Language Models (multilingual BERT, XLM and mDeBERTA) by ensemble learning. Our team ranked 30th out of 45 participants. Our best results were achieved with two unseen languages: Korean (16th) and Hindi (19th).

Cite

CITATION STYLE

APA

García-Díaz, J. A., Pan, R., Zafra, S. M. J., Martín-Valdivia, M. T., Ureña-López, L. A., & Valencia-García, R. (2023). UMUTeam and SINAI at SemEval-2023 Task 9: Multilingual Tweet Intimacy Analysis using Multilingual Large Language Models and Data Augmentation. In 17th International Workshop on Semantic Evaluation, SemEval 2023 - Proceedings of the Workshop (pp. 293–299). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.semeval-1.39

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free