Sakura at SemEval-2023 Task 2: Data Augmentation via Translation

Abstract

We demonstrate a simple yet effective approach to augmenting training data for multilingual named entity recognition using machine translation. The named entity spans from the original sentences are transferred to the translations via word alignment and then filtered with the baseline recognizer to retain high-quality annotations. The proposed data augmentation approach improves the baseline performance of XLM-RoBERTa on the multilingual dataset.
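
The span-transfer and filtering steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the aligner, the tagging scheme, or the filtering criterion, so the BIO tags, the `(source_index, target_index)` alignment pairs, and the exact-match agreement filter are all assumptions made for illustration. The function names `bio_to_spans`, `project_spans`, and `keep_example` are hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, label)


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Convert a BIO tag sequence into (start, end_exclusive, label) spans."""
    spans: List[Span] = []
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing span
        inside = tag.startswith("I-") and label == tag[2:]
        if start is not None and not inside:
            spans.append((start, i, label))
            start, label = None, None
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, label = i, tag[2:]
    return spans


def project_spans(
    src_tags: List[str], alignment: List[Tuple[int, int]], tgt_len: int
) -> List[str]:
    """Project source entity spans onto target tokens via word alignment.

    Assumption: each entity maps to the contiguous target range between its
    first and last aligned target token. Unaligned entities are dropped.
    Overlapping projections are not resolved in this sketch.
    """
    tgt_tags = ["O"] * tgt_len
    for start, end, label in bio_to_spans(src_tags):
        tgt_idx = sorted({t for s, t in alignment if start <= s < end})
        if not tgt_idx:
            continue
        lo, hi = tgt_idx[0], tgt_idx[-1]
        tgt_tags[lo] = "B-" + label
        for j in range(lo + 1, hi + 1):
            tgt_tags[j] = "I-" + label
    return tgt_tags


def keep_example(projected: List[str], predicted: List[str]) -> bool:
    """Assumed filter: keep a translated sentence only when the baseline
    recognizer's predicted spans exactly match the projected spans."""
    return set(bio_to_spans(projected)) == set(bio_to_spans(predicted))


# Source: "Marie Curie lived in Paris"; translation: "In Paris lebte Marie Curie"
src_tags = ["B-PER", "I-PER", "O", "O", "B-LOC"]
alignment = [(0, 3), (1, 4), (2, 2), (3, 0), (4, 1)]  # hypothetical aligner output
projected = project_spans(src_tags, alignment, tgt_len=5)
print(projected)  # ['O', 'B-LOC', 'O', 'B-PER', 'I-PER']
```

In practice, `predicted` would come from the baseline recognizer run on the translated sentence. A softer criterion, such as a confidence threshold or partial span overlap, would fit the same interface.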

Cite (APA)

Poncelas, A., Tkachenko, M., & Htun, O. (2023). Sakura at SemEval-2023 Task 2: Data Augmentation via Translation. In 17th International Workshop on Semantic Evaluation, SemEval 2023 - Proceedings of the Workshop (pp. 1718–1722). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.semeval-1.239
