MELHISSA: a multilingual entity linking architecture for historical press articles

Abstract

Digital libraries play a key role in cultural heritage: they provide access to our culture and history by indexing books and historical documents such as newspapers and letters. They rely on natural language processing (NLP) tools to process these documents and enrich them with meta-information, such as named entities. Despite recent advances, most NLP models are built for specific languages and contemporary documents, and are not optimized for historical material, which may contain, for instance, language variations and optical character recognition (OCR) errors. In this work, we focus on the entity linking (EL) task, which is fundamental to the indexing of documents in digital libraries. We developed a Multilingual Entity Linking architecture for HIstorical preSS Articles (MELHISSA) that combines multilingual analysis, OCR correction, and filter analysis to mitigate the difficulties that historical documents pose for the EL task. The source code is publicly available. Experiments were carried out on two historical document corpora covering five European languages (English, Finnish, French, German, and Swedish). The results show that our system improved the global performance for all languages and datasets, achieving an F-score@1 of up to 0.681 and an F-score@5 of up to 0.787.
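The abstract reports F-score@1 and F-score@5 without spelling out how they are computed. As a rough illustration only, the sketch below assumes one common reading: a mention counts as correctly linked at cutoff k if its gold knowledge-base identifier appears among the system's top-k ranked candidates. The function name `f_score_at_k`, the mention ids, and the Wikidata-style identifiers are hypothetical, and the article's actual scoring protocol (for example, its handling of unlinkable mentions) may differ.

```python
from typing import Dict, List, Tuple


def f_score_at_k(
    gold: Dict[str, str],
    predicted: Dict[str, List[str]],
    k: int,
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) at cutoff k.

    gold: mention id -> gold knowledge-base id.
    predicted: mention id -> ranked candidate ids, best first.
    Assumption: a mention with no candidates counts as "not linked".
    """
    if k < 1:
        raise ValueError("k must be at least 1")

    # Mentions for which the system proposed at least one candidate.
    answered = {m: c for m, c in predicted.items() if c}

    # A mention is correct if its gold id is among the top-k candidates.
    correct = sum(
        1 for m, c in answered.items() if m in gold and gold[m] in c[:k]
    )

    precision = correct / len(answered) if answered else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data (made up for illustration, not taken from the article).
gold_links = {"m1": "Q1", "m2": "Q2", "m3": "Q3", "m4": "Q4"}
system_links = {
    "m1": ["Q1", "Q9"],  # correct at rank 1
    "m2": ["Q8", "Q2"],  # correct only at rank 2
    "m3": ["Q7"],        # wrong
    # "m4" was not linked by the system
}

p1, r1, f1_at_1 = f_score_at_k(gold_links, system_links, k=1)
p5, r5, f1_at_5 = f_score_at_k(gold_links, system_links, k=5)
```

Under this reading the score can only stay equal or grow as k increases, since a larger cutoff accepts every link that a smaller one does; that is consistent with the reported F-score@5 being higher than the F-score@1.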

Citation

Linhares Pontes, E., Cabrera-Diego, L. A., Moreno, J. G., Boros, E., Hamdi, A., Doucet, A., … Coustaty, M. (2022). MELHISSA: a multilingual entity linking architecture for historical press articles. International Journal on Digital Libraries, 23(2), 133–160. https://doi.org/10.1007/s00799-021-00319-6
