In this paper, we propose and explore the impact of four dataset augmentation and extension strategies that we used for subtask 3 of SemEval-2023 Task 3: multilabel classification of persuasion techniques in a multilingual context. We consider two augmentation methods (one based on a modified version of synonym replacement and one based on translation) and two ways of extending the training dataset (using filtered data generated by GPT-3 and using a dataset from a previous competition). We study the effects of these techniques by fine-tuning a pretrained XLM-RoBERTa-Large model on the augmented and/or extended training dataset. Using the augmentation methods alone, we placed 3rd for English, 13th for Italian, and between 5th and 9th for the other 7 languages during the competition.
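To make the synonym-replacement idea concrete, the following is a minimal sketch of plain synonym replacement for multilabel text data. It is not the modified variant used in the paper, whose details are not given in this abstract. The synonym table, the function names `synonym_replace` and `augment`, the replacement probability `p`, and the example label are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical toy synonym table (an assumption for illustration only);
# a real setup would draw synonyms from a lexical resource.
SYNONYMS = {
    "huge": ["enormous", "massive"],
    "said": ["stated", "claimed"],
    "bad": ["terrible", "awful"],
}


def synonym_replace(text, p=0.5, rng=None):
    """Replace each token found in SYNONYMS with probability p."""
    rng = rng or random.Random(0)
    out = []
    for token in text.split():
        candidates = SYNONYMS.get(token.lower())
        if candidates and rng.random() < p:
            out.append(rng.choice(candidates))
        else:
            out.append(token)
    return " ".join(out)


def augment(examples, n_copies=1, p=0.5, seed=0):
    """Return the original (text, labels) pairs plus augmented copies.

    Labels are copied unchanged, on the assumption that swapping a word
    for a synonym does not alter which persuasion techniques are present.
    """
    rng = random.Random(seed)
    augmented = list(examples)
    for text, labels in examples:
        for _ in range(n_copies):
            new_text = synonym_replace(text, p=p, rng=rng)
            if new_text != text:  # skip copies where nothing changed
                augmented.append((new_text, list(labels)))
    return augmented
```

Calling `augment([("a huge lie", ["Loaded_Language"])], p=1.0)` would return the original pair plus one copy in which "huge" is swapped for one of its listed synonyms, with the label list carried over. The augmented pairs could then be fed to whatever fine-tuning pipeline is in use.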
CITATION STYLE
Sergiu, A., Laura, C., & George, S. (2023). Appeal for attention at SemEval-2023 Task 3: Data augmentation and extension strategies for detection of online news persuasion techniques. In 17th International Workshop on Semantic Evaluation, SemEval 2023 - Proceedings of the Workshop (pp. 616–623). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.semeval-1.84