A Character-Level Document Key Information Extraction Method with Contrastive Learning

Xinpeng Zhang; Jiyao Deng; Liangcai Gao

Conference Proceedings


Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2023) 14189 LNCS 216-230

DOI: 10.1007/978-3-031-41682-8_14


Abstract

Key information extraction (KIE) from documents has become a major area of focus in natural language processing. However, practical applications often involve documents that contain visual elements such as icons, tables, and images, which complicates information extraction. Many current methods require large pre-trained language models or multi-modal inputs, which places heavy demands on dataset quality and leads to long training times. Furthermore, KIE datasets frequently suffer from out-of-vocabulary (OOV) issues. To address these challenges, this paper proposes a document KIE method based on an encoder-decoder model. To handle the OOV problem effectively, we use a character-level CNN to encode document information. We also introduce a label feedback mechanism in the decoder that passes the label embedding back to the encoder for predicting adjacent fields. Additionally, we propose a similarity module based on contrastive learning to address the problem of content diversity. Our method requires only text inputs and has fewer parameters, yet still achieves results comparable to state-of-the-art methods on the document KIE task.
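To illustrate why character-level encoding sidesteps the OOV problem mentioned in the abstract, here is a minimal sketch of a character-level convolutional encoder in plain Python. It is not the authors' implementation: the character inventory, the class name `CharCNN`, the helper `encode_chars`, and all dimensions are assumptions chosen for illustration, and the weights are random rather than trained.

```python
import random

# Hypothetical character inventory (an assumption, not taken from the paper).
# Index 0 is reserved for padding, index 1 for characters outside the inventory.
CHARS = "abcdefghijklmnopqrstuvwxyz0123456789 .,:/-$"
PAD, UNK = 0, 1
CHAR_TO_ID = {c: i + 2 for i, c in enumerate(CHARS)}


def encode_chars(text):
    """Map a string to character ids; unseen characters fall back to UNK."""
    return [CHAR_TO_ID.get(c, UNK) for c in text.lower()]


class CharCNN:
    """Toy 1D convolution over character embeddings with max-over-time pooling."""

    def __init__(self, emb_dim=8, num_filters=4, kernel=3, seed=0):
        rng = random.Random(seed)
        vocab_size = len(CHARS) + 2
        self.kernel = kernel
        # Random (untrained) embedding table; the padding row is all zeros.
        self.emb = [[rng.uniform(-1, 1) for _ in range(emb_dim)]
                    for _ in range(vocab_size)]
        self.emb[PAD] = [0.0] * emb_dim
        # Each filter spans `kernel` consecutive character embeddings.
        self.filters = [[[rng.uniform(-1, 1) for _ in range(emb_dim)]
                         for _ in range(kernel)]
                        for _ in range(num_filters)]

    def __call__(self, text):
        ids = encode_chars(text)
        # Pad short inputs so that at least one convolution window exists.
        ids += [PAD] * max(0, self.kernel - len(ids))
        vecs = [self.emb[i] for i in ids]
        features = []
        for filt in self.filters:
            responses = []
            for start in range(len(vecs) - self.kernel + 1):
                window = vecs[start:start + self.kernel]
                total = sum(w * x
                            for w_row, x_row in zip(filt, window)
                            for w, x in zip(w_row, x_row))
                responses.append(max(0.0, total))  # ReLU
            features.append(max(responses))  # max-over-time pooling
        return features


encoder = CharCNN()
# A misspelled or previously unseen token still yields a fixed-size vector,
# because the vocabulary consists of characters rather than words.
vector = encoder("Tot4l: $12.50")
```

Because the lookup table is indexed by characters, a token such as "Tot4l" that would be missing from a word-level vocabulary is still encoded from its constituent characters; only characters outside the inventory fall back to the shared unknown id.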


Cite


Zhang, X., Deng, J., & Gao, L. (2023). A Character-Level Document Key Information Extraction Method with Contrastive Learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 14189 LNCS, pp. 216–230). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-41682-8_14
