A Character-Level Document Key Information Extraction Method with Contrastive Learning

Abstract

Key information extraction (KIE) from documents has become a major area of focus in natural language processing. In practice, however, documents often contain visual elements such as icons, tables, and images, which complicate information extraction. Many current methods rely on large pre-trained language models or multi-modal inputs, which places demanding requirements on dataset quality and leads to long training times. Furthermore, KIE datasets frequently suffer from out-of-vocabulary (OOV) issues. To address these challenges, this paper proposes a document KIE method based on an encoder-decoder model. To handle the OOV problem effectively, we use a character-level CNN to encode document information. We also introduce a label feedback mechanism in the decoder that feeds the label embedding back to the encoder for predicting adjacent fields. Additionally, we propose a similarity module based on contrastive learning to address the problem of content diversity. Our method requires only text inputs and has fewer parameters, yet achieves results comparable to state-of-the-art methods on the document KIE task.
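
To illustrate why character-level encoding sidesteps the OOV problem, the following is a minimal sketch and not the authors' implementation. The character set, the dimensions, the random untrained weights, and the name `encode_token` are all assumptions made for this example. It embeds each character, applies a 1-D convolution with ReLU, and max-pools over positions, so any token, including one never seen during training, maps to a fixed-size vector.

```python
import random

# Hypothetical character inventory; anything else maps to a shared UNK index.
CHARS = "abcdefghijklmnopqrstuvwxyz0123456789.,:-/$ "
CHAR_TO_ID = {c: i for i, c in enumerate(CHARS)}
UNK = len(CHARS)      # unknown character
PAD = len(CHARS) + 1  # padding for tokens shorter than the kernel
EMB_DIM, KERNEL, FILTERS = 4, 3, 5  # illustrative sizes, not from the paper

rng = random.Random(0)
# Randomly initialised (untrained) parameters, for illustration only.
embedding = [[rng.uniform(-1, 1) for _ in range(EMB_DIM)]
             for _ in range(len(CHARS) + 2)]
filters = [[[rng.uniform(-1, 1) for _ in range(EMB_DIM)]
            for _ in range(KERNEL)]
           for _ in range(FILTERS)]


def encode_token(token):
    """Encode a token as a FILTERS-sized vector from its characters."""
    ids = [CHAR_TO_ID.get(c, UNK) for c in token.lower()]
    ids += [PAD] * max(0, KERNEL - len(ids))  # guarantee one full window
    vectors = [embedding[i] for i in ids]
    pooled = []
    for f in filters:
        activations = []
        for start in range(len(vectors) - KERNEL + 1):
            s = sum(f[k][d] * vectors[start + k][d]
                    for k in range(KERNEL) for d in range(EMB_DIM))
            activations.append(max(0.0, s))  # ReLU
        pooled.append(max(activations))      # max-over-time pooling
    return pooled


# A token absent from any word vocabulary still gets a representation.
vec = encode_token("inv0ice#2023")
```

Because the lookup table is indexed by characters rather than words, its size stays small and fixed regardless of how many distinct tokens appear in the documents.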

Cite

Zhang, X., Deng, J., & Gao, L. (2023). A Character-Level Document Key Information Extraction Method with Contrastive Learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 14189 LNCS, pp. 216–230). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-41682-8_14
