Abstract
This paper describes our system for SemEval-2022 Task 8: Multilingual News Article Similarity. We propose a linguistics-inspired model trained with several task-specific strategies. The main techniques of our system are: 1) data augmentation, 2) multi-label loss, 3) adapted R-Drop, and 4) sample reconstruction with head-tail combination. We also briefly analyze methods that did not help, such as a two-tower architecture. Our system ranked 1st on the leaderboard, achieving a Pearson correlation coefficient of 0.818 on the official evaluation set.
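For readers unfamiliar with the evaluation metric, the Pearson correlation coefficient between predicted and gold similarity scores can be sketched as follows. This is a minimal illustration of the standard formula, not the task's official scoring script; the function name `pearson_r` and the example scores are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical gold and predicted similarity scores for four article pairs.
gold = [1.0, 2.0, 3.0, 4.0]
pred = [1.2, 1.9, 3.4, 3.6]
print(round(pearson_r(gold, pred), 3))
```

A value of 1 means the predictions are a perfect increasing linear function of the gold scores, 0 means no linear relationship, and -1 means a perfect decreasing one.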
Cite
Xu, Z., Yang, Z., Cui, Y., & Chen, Z. (2022). HFL at SemEval-2022 Task 8: A Linguistics-inspired Regression Model with Data Augmentation for Multilingual News Similarity. In SemEval 2022 - 16th International Workshop on Semantic Evaluation, Proceedings of the Workshop (pp. 1114–1120). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.semeval-1.157