Improving Handwriting Recognition for Historical Documents Using Synthetic Text Lines

Abstract

Automatic handwriting recognition for historical documents is a key element for making our cultural heritage available to researchers and the general public. However, current approaches based on machine learning require a considerable amount of annotated learning samples to read ancient scripts and languages. Producing such ground truth is a laborious and time-consuming task that often requires human experts. In this paper, to cope with a limited amount of learning samples, we explore the impact of using synthetic text line images to support the training of handwriting recognition systems. For generating text lines, we consider lineGen, a recent GAN-based approach, and for handwriting recognition, we consider HTR-Flor, a state-of-the-art recognition system. Different meta-learning strategies are explored that schedule the addition of synthetic text line images to the existing real samples. In an experimental evaluation on the well-known Bentham dataset as well as the newly introduced Bullinger dataset, we demonstrate a significant improvement of the recognition performance when combining real and synthetic samples.
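The abstract does not specify the scheduling strategies, so the following is only a minimal sketch of the general idea of adding synthetic text lines to the real training samples on a schedule. It assumes a hypothetical linear schedule in which the share of synthetic lines grows over the epochs. The function names, the `start` and `end` parameters, and the mixing rule are all illustrative assumptions; they are not taken from the paper, and the sketch uses neither lineGen nor HTR-Flor.

```python
import random


def linear_synthetic_ratio(epoch, total_epochs, start=0.0, end=0.5):
    """Hypothetical schedule: the synthetic share grows linearly from start to end."""
    if total_epochs <= 1:
        return end
    t = epoch / (total_epochs - 1)
    return start + t * (end - start)


def build_epoch_samples(real, synthetic, ratio, rng):
    """Keep every real sample and add synthetic ones so that they make up
    roughly `ratio` of the epoch (capped by the size of the synthetic pool)."""
    if not 0.0 <= ratio < 1.0:
        raise ValueError("ratio must be in [0, 1)")
    # Solve n_syn / (n_real + n_syn) = ratio for n_syn.
    n_syn = round(len(real) * ratio / (1.0 - ratio))
    n_syn = min(n_syn, len(synthetic))
    mixed = list(real) + rng.sample(synthetic, n_syn)
    rng.shuffle(mixed)
    return mixed


if __name__ == "__main__":
    rng = random.Random(0)
    # Placeholder identifiers standing in for text line images.
    real_lines = [f"real_{i}" for i in range(100)]
    synthetic_lines = [f"syn_{i}" for i in range(200)]
    for epoch in range(5):
        ratio = linear_synthetic_ratio(epoch, total_epochs=5)
        samples = build_epoch_samples(real_lines, synthetic_lines, ratio, rng)
        print(epoch, ratio, len(samples))
```

Keeping all real samples in every epoch and varying only the number of synthetic ones is one possible design choice. Other schedules, such as pre-training on synthetic lines and then fine-tuning on real ones, would fit the same description.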

Cite

Spoto, M., Wolf, B., Fischer, A., & Scius-Bertrand, A. (2022). Improving Handwriting Recognition for Historical Documents Using Synthetic Text Lines. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13424 LNCS, pp. 61–75). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-19745-1_5
