Combining Self-training and Minimal Annotations for Handwritten Word Recognition

Abstract

Handwritten Text Recognition (HTR) relies on deep learning to achieve high performance. Its success is substantially driven by large annotated training datasets, which yield powerful recognition models. Performance suffers considerably when a model is applied to document collections with a distinctive style that is not well represented in the training data. Adapting a recognition model to a new collection therefore requires a tremendous annotation effort, which is often infeasible, for example for historical collections. To overcome this limitation, we propose a training scheme that combines multiple data sources. Synthetically generated samples are used to train an initial model, and self-training makes it possible to exploit unlabeled samples. We further investigate how a small number of manually annotated samples can be integrated to achieve maximal performance with limited annotation effort. To this end, we add labeled samples at different stages of self-training and propose two criteria, namely confidence and diversity, for selecting the samples to annotate. In our experiments, we show that the proposed training scheme considerably closes the gap to fully supervised training on the designated training set while requiring less than ten percent of the labeling effort.
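
The selection steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how confidence or diversity is measured, so the function names, the 0.9 threshold, the choice to annotate the least-confident samples, and the greedy farthest-point heuristic over feature vectors are all assumptions made for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def select_pseudo_labels(predictions, threshold=0.9):
    """Self-training step: keep predicted labels the model is confident about.

    `predictions` is a list of (sample_id, predicted_word, confidence) tuples.
    The threshold value is a hypothetical choice, not taken from the paper.
    """
    return [(sid, word) for sid, word, conf in predictions if conf >= threshold]


def select_by_confidence(predictions, budget):
    """Annotation step (assumed variant): pick the least-confident samples."""
    ranked = sorted(predictions, key=lambda p: p[2])
    return [sid for sid, _, _ in ranked[:budget]]


def select_by_diversity(features, budget):
    """Annotation step (assumed variant): greedy farthest-point selection.

    `features` maps sample_id to a feature vector. Each round adds the sample
    whose nearest already-chosen sample is farthest away.
    """
    ids = list(features)
    if not ids or budget <= 0:
        return []
    chosen = [ids[0]]
    while len(chosen) < min(budget, len(ids)):
        remaining = [i for i in ids if i not in chosen]
        best = max(
            remaining,
            key=lambda i: min(dist(features[i], features[c]) for c in chosen),
        )
        chosen.append(best)
    return chosen


# Toy data, invented for this example only.
predictions = [("a", "the", 0.95), ("b", "cat", 0.40),
               ("c", "sat", 0.99), ("d", "mat", 0.60)]
features = {"a": (0.0, 0.0), "b": (0.1, 0.0), "c": (5.0, 5.0), "d": (5.0, 0.0)}

pseudo_labeled = select_pseudo_labels(predictions)
to_annotate_conf = select_by_confidence(predictions, budget=2)
to_annotate_div = select_by_diversity(features, budget=3)
```

In a full loop, the pseudo-labeled pairs and the newly annotated samples would be added to the training set before the model is retrained, and the process repeated for several rounds.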

Cite

Wolf, F., & Fink, G. A. (2022). Combining Self-training and Minimal Annotations for Handwritten Word Recognition. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13639 LNCS, pp. 300–315). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-21648-0_21
