Automating Benchmark Generation for Named Entity Recognition and Entity Linking

Abstract

Named Entity Recognition (NER) and Linking (NEL) have seen great advances lately, especially with the development of language models pre-trained on large document corpora, typically written in the most popular languages (e.g., English). This makes NER and NEL tools for other languages, with fewer resources available, fall behind the latest advances in AI. In this work, we propose an automated benchmark data generation process for the tasks of NER and NEL, based on Wikipedia events. Although our process is applied and evaluated on Greek texts, the only requirement for its applicability to other languages is the availability of Wikipedia events pages in that language. The generated Greek datasets, comprising around 19k events and 41k entity mentions, as well as the code to generate such datasets, are publicly available.
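
The abstract does not spell out how annotations are derived from Wikipedia events pages. As an illustration only, the sketch below assumes that hyperlinks in an event description can serve as entity mentions (the link text) paired with their linking targets (the linked page title). That assumption, the function name `extract_mentions`, the regular expression, and the sample sentence are all hypothetical and are not taken from the authors' released code.

```python
import re

# Assumption: mentions are encoded as MediaWiki-style links, either
# [[Target]] or [[Target|surface text]]. Nested links, files and
# categories are deliberately ignored in this sketch.
WIKILINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def extract_mentions(wikitext):
    """Strip link markup and return (plain_text, mentions).

    Each mention records character offsets into plain_text, the surface
    form (NER span) and the linked page title (NEL target).
    """
    parts = []      # pieces of the plain text, in order
    mentions = []
    pos = 0         # read position in the raw wikitext
    out_len = 0     # length of the plain text emitted so far
    for m in WIKILINK.finditer(wikitext):
        before = wikitext[pos:m.start()]
        parts.append(before)
        out_len += len(before)
        target = m.group(1).strip()
        # Fall back to the target when no explicit surface text is given.
        surface = m.group(2) if m.group(2) else m.group(1)
        mentions.append({
            "start": out_len,
            "end": out_len + len(surface),
            "surface": surface,
            "target": target,
        })
        parts.append(surface)
        out_len += len(surface)
        pos = m.end()
    parts.append(wikitext[pos:])
    return "".join(parts), mentions


# Made-up example sentence, not drawn from the published datasets.
sample = "A fire broke out in [[Athens]], the capital of [[Greece|the Hellenic Republic]]."
plain, found = extract_mentions(sample)
```

Under this assumption the same routine would apply to any language edition, since it depends only on the link markup and not on the language of the text.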

Cite

Papantoniou, K., Efthymiou, V., & Plexousakis, D. (2023). Automating Benchmark Generation for Named Entity Recognition and Entity Linking. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13998 LNCS, pp. 143–148). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-43458-7_27
