A named entity extraction system for historical financial data

Wassim Swaileh; Thierry Paquet; Sébastien Adam; Andres Rojas Camacho

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2020) 12116 LNCS 324-340

DOI: 10.1007/978-3-030-57058-3_23

Abstract

Access to long-run historical data in the social sciences, economics and political sciences has been identified as a necessary condition for understanding the dynamics of the past and the way those dynamics structure our present and future. Financial yearbooks are historical records reporting information about the companies listed on stock exchanges. This paper describes the key components of a system that extracts financial information from financial yearbooks. The proposed system consists of three steps: OCR, linked named entity (LNE) extraction, and active learning. The core of the system is LNE extraction. LNEs are coherent n-tuples of named entities describing high-level semantic information. In this respect, we developed, tested and compared a CRF-based system and a hybrid RNN/CRF-based system. Active learning makes it possible to cope with the lack of annotated data for training the system. Promising performance results are reported on two yearbooks (the French Desfossé yearbook (1962) and the German Handbuch (1914–15)) and for two LNE extraction tasks: capital information of companies and constitution information of companies.

Cite

Swaileh, W., Paquet, T., Adam, S., & Rojas Camacho, A. (2020). A named entity extraction system for historical financial data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12116 LNCS, pp. 324–340). Springer. https://doi.org/10.1007/978-3-030-57058-3_23
