Pre-training and evaluation of numeracy-oriented language model


Abstract

Pre-trained language models (LMs) have led to significant performance gains in various natural language processing (NLP) applications due to their strong literacy, e.g., the ability to capture word dependencies. However, existing pre-trained LMs largely ignore numeracy, i.e., they treat numbers within text as plain words, without understanding basic numerical concepts. This weak numeracy has become a barrier to the use of pre-trained LMs in NLP applications over number-intensive financial documents such as annual filings and analyst reports. Yet the understanding and analysis of financial documents are becoming increasingly important. To bridge this gap, this work explores the central theme of numerical pre-training to empower LMs with numeracy. In particular, we propose two numerical pre-training methods with objectives that encourage the LM to understand the magnitude and value of numbers and to encode the dependency between a number and its context. By applying the proposed methods to BERT, we pre-train two LMs, named BERT-M and BERT-V. Moreover, we construct four datasets of financial documents for evaluating the numeracy of pre-trained LMs, focusing on three fundamental perspectives of numeracy: a) number embedding; b) number-text composition; and c) number-number composition. Extensive experiments on the datasets validate the effectiveness of the pre-trained BERT-M and BERT-V, which outperform the state-of-the-art LM for financial documents (FinBERT) by 4.83% and 4.34% on average. Furthermore, their aggregation, named BERT-MV, increases the gain to 10.88%.
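The abstract does not spell out the pre-training objectives, but a magnitude-oriented objective of the kind BERT-M suggests could, for instance, bucket each number in the text into an order-of-magnitude class for the LM to predict. The following is a minimal illustrative sketch of such bucketing only; the bucket scheme and function names are assumptions, not taken from the paper.

```python
import math
import re

def magnitude_bucket(token: str, num_buckets: int = 10) -> int:
    """Map a numeric token to an order-of-magnitude class.

    Bucket 0 covers |x| < 1 (including zero); larger values fall into
    bucket floor(log10(|x|)) + 1, clamped to num_buckets - 1.
    (Bucketing scheme is a hypothetical illustration, not from the paper.)
    """
    value = abs(float(token.replace(",", "")))
    if value < 1:
        return 0
    return min(int(math.floor(math.log10(value))) + 1, num_buckets - 1)

def magnitude_labels(text: str) -> list[tuple[str, int]]:
    """Extract numeric tokens from text and pair each with its magnitude
    class, yielding (number, class) targets for a prediction head."""
    numbers = re.findall(r"\d[\d,]*(?:\.\d+)?", text)
    return [(n, magnitude_bucket(n)) for n in numbers]

# Example on a filing-style sentence:
# "12.5" falls in bucket 2 (tens), "3,400" in bucket 4 (thousands).
print(magnitude_labels("Revenue rose 12.5% to 3,400 million."))
```

Such class labels could then supervise a classification head over the LM's number-token representations, alongside the standard masked-language-modeling loss.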

Citation (APA)

Feng, F., Rui, X., Wang, W., Cao, Y., & Chua, T. S. (2021). Pre-training and evaluation of numeracy-oriented language model. In ICAIF 2021 - 2nd ACM International Conference on AI in Finance. Association for Computing Machinery, Inc. https://doi.org/10.1145/3490354.3494412
