FormNet: Structural Encoding beyond Sequential Modeling in Form Document Information Extraction


Abstract

Sequence modeling has achieved state-of-the-art performance on natural language and document understanding tasks. However, correctly serializing the tokens in form-like documents is challenging in practice due to the variety of their layout patterns. We propose FormNet, a structure-aware sequence model that mitigates the suboptimal serialization of forms. First, we design Rich Attention, which leverages the spatial relationships between tokens in a form to compute more precise attention scores. Second, we construct a Super-Token for each word by embedding representations from its neighboring tokens through graph convolutions. FormNet thus explicitly recovers local syntactic information that may have been lost during serialization. In experiments, FormNet outperforms existing methods with a more compact model size and less pretraining data, establishing new state-of-the-art results on the CORD, FUNSD, and Payment benchmarks.
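The two components named in the abstract can be sketched in miniature. The block below is an illustrative NumPy sketch, not the paper's actual architecture: (a) a single graph-convolution layer that aggregates each token's spatial neighbors into a Super-Token-style representation, and (b) attention scores augmented with an additive spatial-distance term, a simplified stand-in for Rich Attention. All function names, dimensions, the adjacency, and the bias form are assumptions made for illustration.

```python
import numpy as np

def graph_conv_super_tokens(token_emb, adjacency, weight):
    """One graph-convolution layer: each token aggregates the mean of its
    spatial neighbors' embeddings, adds its own embedding, and applies a
    linear projection followed by ReLU."""
    deg = adjacency.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0  # avoid division by zero for isolated tokens
    neighbor_mean = (adjacency @ token_emb) / deg
    return np.maximum((token_emb + neighbor_mean) @ weight, 0.0)

def spatially_biased_attention(q, k, v, distances, alpha=-0.1):
    """Scaled dot-product attention with an additive penalty proportional
    to spatial distance, so nearby tokens attend to each other more
    strongly (a simplified stand-in for Rich Attention)."""
    scores = q @ k.T / np.sqrt(q.shape[-1]) + alpha * distances
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Toy form with 4 tokens laid out in a line; adjacent tokens share an edge.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                  # token embeddings
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)  # spatial adjacency
dist = np.abs(np.arange(4)[:, None] - np.arange(4)[None, :]).astype(float)
w = rng.normal(size=(8, 8))

super_tokens = graph_conv_super_tokens(x, adj, w)
attended = spatially_biased_attention(super_tokens, super_tokens,
                                      super_tokens, dist)
print(super_tokens.shape, attended.shape)  # (4, 8) (4, 8)
```

The design point the sketch illustrates: the graph convolution restores local neighborhood structure that a 1-D serialization discards, and the spatial term lets attention prefer tokens that are physically close on the page rather than merely adjacent in the token sequence.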

Cite (APA)

Lee, C. Y., Li, C. L., Dozat, T., Perot, V., Su, G., Hua, N., … Pfister, T. (2022). FormNet: Structural Encoding beyond Sequential Modeling in Form Document Information Extraction. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 1, pp. 3735–3754). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.acl-long.260
