Representative image selection for data efficient word spotting

2Citations
Citations of this article
3Readers
Mendeley users who have this article in their library.
Get full text

Abstract

This paper compares three different word image representations as base for label free sample selection for word spotting in historical handwritten documents. These representations are a temporal pyramid representation based on pixel counts, a graph based representation, and a pyramidal histogram of characters (PHOC) representation predicted by a PHOCNet trained on synthetic data. We show that the PHOC representation can help to reduce the amount of required training samples by up to 69% depending on the dataset, if it is learned iteratively in an active learning like fashion. While this works for larger datasets containing about 1,700 images, for smaller datasets with 100 images, we find that the temporal pyramid and the graph representation perform better.

Cite

CITATION STYLE

APA

Westphal, F., Grahn, H., & Lavesson, N. (2020). Representative image selection for data efficient word spotting. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12116 LNCS, pp. 383–397). Springer. https://doi.org/10.1007/978-3-030-57058-3_27

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free