Automating Benchmark Generation for Named Entity Recognition and Entity Linking

Katerina Papantoniou; Vasilis Efthymiou; Dimitris Plexousakis

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2023) 13998 LNCS 143-148

DOI: 10.1007/978-3-031-43458-7_27

Abstract

Named Entity Recognition (NER) and Linking (NEL) have seen great advances lately, especially with the development of language models pre-trained on large document corpora, typically written in the most popular languages (e.g., English). As a result, NER and NEL tools for languages with fewer available resources fall behind the latest advances in AI. In this work, we propose an automated benchmark data generation process for the tasks of NER and NEL, based on Wikipedia events. Although our process is applied and evaluated on Greek texts, the only requirement for its applicability to other languages is the availability of Wikipedia events pages in that language. The generated Greek datasets, comprising around 19k events and 41k entity mentions, as well as the code to generate such datasets, are publicly available.

Cite

Papantoniou, K., Efthymiou, V., & Plexousakis, D. (2023). Automating Benchmark Generation for Named Entity Recognition and Entity Linking. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13998 LNCS, pp. 143–148). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-43458-7_27
