FiNER: Financial Numeric Entity Recognition for XBRL Tagging

Lefteris Loukas; Manos Fergadiotis; Ilias Chalkidis; Eirini Spyropoulou; Prodromos Malakasiotis; Ion Androutsopoulos; George Paliouras

Conference Proceedings, Open Access

Proceedings of the Annual Meeting of the Association for Computational Linguistics (2022), Vol. 1, pp. 4419-4431

DOI: 10.18653/v1/2022.acl-long.303

Abstract

Publicly traded companies are required to submit periodic reports with eXtensible Business Reporting Language (XBRL) word-level tags. Manually tagging the reports is tedious and costly. We therefore introduce XBRL tagging as a new entity extraction task for the financial domain and release FiNER-139, a dataset of 1.1M sentences with gold XBRL tags. Unlike typical entity extraction datasets, FiNER-139 uses a much larger label set of 139 entity types. Most annotated tokens are numeric, and the correct tag for each token depends mostly on context rather than on the token itself. We show that subword fragmentation of numeric expressions harms BERT's performance, allowing word-level BiLSTMs to perform better. To improve BERT's performance, we propose two simple and effective solutions that replace numeric expressions with pseudo-tokens reflecting original token shapes and numeric magnitudes. We also experiment with FIN-BERT, an existing BERT model for the financial domain, and release our own BERT (SEC-BERT), pre-trained on financial filings, which performs best. Through data and error analysis, we finally identify possible limitations to inspire future work on XBRL tagging.
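As an illustration of the pseudo-token idea mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a numeric token is replaced by a bracketed shape string in which every digit becomes `X`, so that a subword tokenizer could treat the whole number as one unit. The regular expression, the bracket format, and the function names `shape_pseudo_token` and `mask_numbers` are all hypothetical choices made for this example.

```python
import re

# Assumption: a "numeric expression" is digits with optional thousands
# separators and an optional decimal part, e.g. "53.2" or "40,200.5".
NUMERIC = re.compile(r"^\d[\d,]*(\.\d+)?$")


def shape_pseudo_token(token: str) -> str:
    """Map a numeric token to a shape pseudo-token; leave other tokens unchanged."""
    if not NUMERIC.match(token):
        return token
    # Replace every digit with 'X' but keep separators, so the pseudo-token
    # preserves both the shape and the rough magnitude of the number.
    return "[" + re.sub(r"\d", "X", token) + "]"


def mask_numbers(tokens: list[str]) -> list[str]:
    """Apply the shape replacement to every token of a word-tokenized sentence."""
    return [shape_pseudo_token(t) for t in tokens]


if __name__ == "__main__":
    sentence = "Revenue increased by 40,200.5 thousand , or 53.2 %".split()
    print(mask_numbers(sentence))
```

In a real pipeline each distinct pseudo-token would also need to be added to the model's vocabulary so that it is not split into subwords again; that step is omitted here.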

Citation (APA)

Loukas, L., Fergadiotis, M., Chalkidis, I., Spyropoulou, E., Malakasiotis, P., Androutsopoulos, I., & Paliouras, G. (2022). FiNER: Financial Numeric Entity Recognition for XBRL Tagging. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 1, pp. 4419–4431). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.acl-long.303
