Representative image selection for data efficient word spotting

Florian Westphal; Håkan Grahn; Niklas Lavesson

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2020) 12116 LNCS 383-397

DOI: 10.1007/978-3-030-57058-3_27

Abstract

This paper compares three word image representations as the basis for label-free sample selection for word spotting in historical handwritten documents. The representations are a temporal pyramid representation based on pixel counts, a graph-based representation, and a pyramidal histogram of characters (PHOC) representation predicted by a PHOCNet trained on synthetic data. We show that the PHOC representation can reduce the number of required training samples by up to 69%, depending on the dataset, if it is learned iteratively in an active-learning-like fashion. While this works for larger datasets containing about 1,700 images, for smaller datasets with 100 images we find that the temporal pyramid and the graph representation perform better.
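
To make the PHOC representation named in the abstract more concrete, the following is a minimal sketch of how a PHOC vector is commonly built from a word's transcription. It is not the paper's implementation: the alphabet, the pyramid levels, the 50% overlap rule, and the function name `phoc` are assumptions chosen for illustration. In the paper the vector is predicted from word images by a PHOCNet, whereas here it is computed directly from text.

```python
import string

# Assumed alphabet: lowercase letters and digits (illustrative choice).
ALPHABET = string.ascii_lowercase + string.digits


def phoc(word, levels=(1, 2, 3), alphabet=ALPHABET):
    """Return a binary pyramidal histogram of characters for `word`.

    At pyramid level k the word is split into k equal regions. A character
    is marked as present in a region if at least half of the character's
    own span falls inside that region (assumed overlap rule).
    """
    word = word.lower()
    n = len(word)
    vector = []
    for level in levels:
        for region in range(level):
            bits = [0] * len(alphabet)
            for i, ch in enumerate(word):
                if ch not in alphabet:
                    continue
                # Work in integer units of 1 / (n * level) to avoid
                # floating point issues at exactly 50% overlap.
                c_start, c_end = i * level, (i + 1) * level
                r_start, r_end = region * n, (region + 1) * n
                overlap = min(c_end, r_end) - max(c_start, r_start)
                if 2 * overlap >= level:  # overlap >= 50% of character span
                    bits[alphabet.index(ch)] = 1
            vector.extend(bits)
    return vector
```

With the assumed defaults, each word maps to a vector of fixed length (number of regions times alphabet size), so words of different lengths can be compared with an ordinary vector distance.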

Citation (APA)

Westphal, F., Grahn, H., & Lavesson, N. (2020). Representative image selection for data efficient word spotting. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12116 LNCS, pp. 383–397). Springer. https://doi.org/10.1007/978-3-030-57058-3_27
