CurT: End-to-End Text Line Detection in Historical Documents with Transformers

Abstract

We present the curve transformer (CurT), a novel method for direct baseline detection that models text line detection in documents as set prediction of cubic Bézier curves. This simplifies the layout analysis pipeline by removing the laboriously hand-crafted postprocessing algorithms that current state-of-the-art methods require. CurT combines several appealing features: direct prediction, which enables processing of material ill-suited to the prevailing methods that adapt semantic segmentation backbones; a conceptually simple Transformer-based encoder-decoder architecture that can be extended to tasks beyond baseline detection; and greater computational efficiency than older approaches. In addition, we demonstrate that CurT achieves metrics competitive with methods based on semantic segmentation. Training and inference code is available under the Apache 2.0 license at https://github.com/mittagessen/curt.
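
To make the curve representation concrete, the sketch below evaluates a cubic Bézier curve from four control points and samples it into a polyline, which is one plausible way to turn a predicted curve into a drawable baseline. It is only an illustration of the standard Bernstein form: the function names `cubic_bezier` and `sample_baseline` are hypothetical, and the abstract does not specify how CurT parameterizes or decodes its outputs.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def cubic_bezier(p0: Point, p1: Point, p2: Point, p3: Point, t: float) -> Point:
    """Evaluate a cubic Bezier curve at parameter t in [0, 1] (Bernstein form)."""
    u = 1.0 - t
    # Cubic Bernstein basis weights; they sum to 1 for any t.
    b0 = u * u * u
    b1 = 3.0 * u * u * t
    b2 = 3.0 * u * t * t
    b3 = t * t * t
    x = b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0]
    y = b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
    return (x, y)


def sample_baseline(control_points: Sequence[Point], n: int = 5) -> List[Point]:
    """Sample n evenly spaced parameter values along one curve (hypothetical helper)."""
    if len(control_points) != 4:
        raise ValueError("a cubic Bezier curve needs exactly four control points")
    if n < 2:
        raise ValueError("n must be at least 2 to include both endpoints")
    p0, p1, p2, p3 = control_points
    return [cubic_bezier(p0, p1, p2, p3, i / (n - 1)) for i in range(n)]


# Assumed example: four control points describing a gently curved text line.
line = sample_baseline([(0.0, 0.0), (10.0, 2.0), (20.0, 2.0), (30.0, 0.0)], n=5)
```

The curve always starts at the first control point and ends at the last, so a text line under this representation is fully described by eight numbers, whatever its length or curvature.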

Cite

Kiessling, B. (2022). CurT: End-to-End Text Line Detection in Historical Documents with Transformers. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13639 LNCS, pp. 34–48). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-21648-0_3
