Sartipi-Sedighin at SemEval-2023 Task 2: Fine-grained Named Entity Recognition with Pre-trained Contextual Language Models and Data Augmentation from Wikipedia

Amir Sartipi; Amirreza Sedighin; Afsaneh Fatemi; Hamidreza Baradaran Kashani

Conference Proceedings (Open Access)

17th International Workshop on Semantic Evaluation, SemEval 2023 - Proceedings of the Workshop (2023) 565-579

DOI: 10.18653/v1/2023.semeval-1.78

Abstract

This paper presents the system developed by the Sartipi-Sedighin team for SemEval 2023 Task 2, MultiCoNER II, a shared task on multilingual complex named entity recognition (NER). The goal of the task is to identify and classify complex named entities (NEs) in text across multiple languages. To tackle it, we fine-tuned pre-trained language models (PLMs) for each language included in the dataset. We also applied a data augmentation technique to increase the amount of training data available to our models: we searched Wikipedia for NEs that already appeared in the training data and added the new instances of these entities to our training corpus. We submitted to 13 tracks of the shared task, and our team achieved an overall F1 score of 61.25% in the English track and 71.79% in the multilingual track.
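The augmentation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes the Wikipedia sentences have already been retrieved as plain strings, that training data is BIO-tagged, and that entities are matched by exact case-insensitive token comparison. The function names (`collect_entities`, `tag_sentence`, `augment`) and the example label are hypothetical.

```python
import re
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (word, BIO tag)


def collect_entities(training: List[List[Token]]) -> Dict[Tuple[str, ...], str]:
    """Map each entity's lowercased token sequence to its label, read from BIO tags."""
    entities: Dict[Tuple[str, ...], str] = {}
    for sentence in training:
        span, label = [], None
        # A sentinel "O" token flushes an entity that ends the sentence.
        for word, tag in sentence + [("", "O")]:
            if tag.startswith("I-") and span and tag[2:] == label:
                span.append(word.lower())
                continue
            if span:
                entities[tuple(span)] = label
            span, label = [], None
            if tag.startswith("B-"):
                span, label = [word.lower()], tag[2:]
    return entities


def tag_sentence(text: str, entities: Dict[Tuple[str, ...], str]) -> List[Token]:
    """Tokenize a raw sentence and tag exact matches of known entities (assumed matching rule)."""
    words = re.findall(r"\w+|[^\w\s]", text)
    lowered = [w.lower() for w in words]
    tags = ["O"] * len(words)
    # Longer entities are tried first so they win over nested shorter ones.
    for span, label in sorted(entities.items(), key=lambda kv: -len(kv[0])):
        n = len(span)
        for i in range(len(words) - n + 1):
            window_free = all(t == "O" for t in tags[i:i + n])
            if tuple(lowered[i:i + n]) == span and window_free:
                tags[i] = "B-" + label
                for j in range(i + 1, i + n):
                    tags[j] = "I-" + label
    return list(zip(words, tags))


def augment(training: List[List[Token]], wiki_sentences: List[str]) -> List[List[Token]]:
    """Append every Wikipedia sentence that mentions at least one known entity."""
    entities = collect_entities(training)
    new_instances = []
    for text in wiki_sentences:
        tagged = tag_sentence(text, entities)
        if any(tag != "O" for _, tag in tagged):
            new_instances.append(tagged)
    return training + new_instances


# Illustrative data only; the label name is an assumption, not taken from the paper.
training = [[("Marie", "B-Scientist"), ("Curie", "I-Scientist"), ("won", "O")]]
wiki = ["Marie Curie was born in Warsaw.", "Nothing relevant here."]
augmented = augment(training, wiki)
```

In this sketch only the first Wikipedia sentence is added, since the second mentions no entity from the training data. A real system would also need to handle ambiguous surface forms, which exact matching ignores.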

Cite

Sartipi, A., Sedighin, A., Fatemi, A., & Kashani, H. B. (2023). Sartipi-Sedighin at SemEval-2023 Task 2: Fine-grained Named Entity Recognition with Pre-trained Contextual Language Models and Data Augmentation from Wikipedia. In 17th International Workshop on Semantic Evaluation, SemEval 2023 - Proceedings of the Workshop (pp. 565–579). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.semeval-1.78
