WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER

68Citations
Citations of this article
71Readers
Mendeley users who have this article in their library.

Abstract

Multilingual Named Entity Recognition (NER) is a key intermediate task which is needed in many areas of NLP. In this paper, we address the well-known issue of data scarcity in NER, especially relevant when moving to a multilingual scenario, and go beyond current approaches to the creation of multilingual silver data for the task. We exploit the texts of Wikipedia and introduce a new methodology based on the effective combination of knowledge-based approaches and neural models, together with a novel domain adaptation technique, to produce high-quality training corpora for NER. We evaluate our datasets extensively on standard benchmarks for NER, yielding substantial improvements of up to 6 span-based F1-score points over previous state-of-the-art systems for data creation.

Cite

CITATION STYLE

APA

Tedeschi, S., Maiorca, V., Campolungo, N., Cecconi, F., & Navigli, R. (2021). WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER. In Findings of the Association for Computational Linguistics, Findings of ACL: EMNLP 2021 (pp. 2521–2533). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.findings-emnlp.215

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free