Two-stage approach to extracting visual objects from paper documents

Paweł Forczmański; Andrzej Markiewicz

Journal Article, Open Access

Machine Vision and Applications (2016) 27(8) 1243-1257

DOI: 10.1007/s00138-016-0803-5

13 Citations

16 Readers

Abstract

This paper presents an approach to the automatic detection and identification of important elements in paper documents: stamps, logos, printed text blocks, signatures, and tables. The approach consists of two stages. The first stage detects objects by means of an AdaBoost cascade of weak classifiers operating on Haar-like features. In the second stage, the resulting image blocks are verified using selected features calculated from recently proposed low-level descriptors, combined with classifiers representing current machine-learning approaches. The training phase for both stages uses bootstrapping, i.e., an iterative process aimed at increasing accuracy. Experiments performed on a large set of digitized paper documents showed that the adopted strategy is useful and efficient.
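The detect-then-verify structure and the bootstrapped training described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `two_stage_extract`, `bootstrap`, `detect`, `verify`, `train`, and `predict` are hypothetical, the detector and verifier are passed in as plain callables rather than an actual AdaBoost cascade or descriptor-based classifier, and bootstrapping is assumed here to mean retraining with false positives added as extra negatives, which is one common form of the technique.

```python
from typing import Any, Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height), an assumed convention


def two_stage_extract(
    page: Any,
    detect: Callable[[Any], Sequence[Box]],
    verify: Callable[[Any, Box], bool],
) -> List[Box]:
    """Stage 1 proposes candidate boxes; stage 2 keeps only verified ones."""
    candidates = detect(page)  # stand-in for a fast, permissive detector
    return [box for box in candidates if verify(page, box)]


def bootstrap(
    train: Callable[[Sequence[Any], Sequence[Any]], Any],
    predict: Callable[[Any, Any], bool],
    positives: Sequence[Any],
    negatives: Sequence[Any],
    negative_pool: Sequence[Any],
    rounds: int = 3,
) -> Any:
    """Retrain repeatedly, adding misclassified pool samples as negatives.

    Every sample in negative_pool is known to be a negative, so any sample
    the current model accepts is a false positive (a "hard" negative).
    """
    current_negatives = list(negatives)
    model = train(positives, current_negatives)
    for _ in range(rounds):
        hard = [
            sample
            for sample in negative_pool
            if predict(model, sample) and sample not in current_negatives
        ]
        if not hard:  # no false positives left in the pool
            break
        current_negatives.extend(hard)
        model = train(positives, current_negatives)
    return model


# Toy usage (hypothetical data): samples are numbers and the "model" is a
# threshold halfway between the smallest positive and the largest negative.
def toy_train(pos: Sequence[float], neg: Sequence[float]) -> float:
    return (min(pos) + max(neg)) / 2


def toy_predict(threshold: float, sample: float) -> bool:
    return sample >= threshold


toy_model = bootstrap(toy_train, toy_predict, [8, 9, 10], [1, 2], [3, 6, 7])

toy_boxes = two_stage_extract(
    page=None,
    detect=lambda page: [(0, 0, 10, 10), (5, 5, 2, 2)],
    verify=lambda page, box: box[2] * box[3] >= 50,  # toy area check
)
```

In the toy run, the first model accepts the pool samples 6 and 7, so they are added as negatives and the threshold tightens on the next round. The same loop shape would apply to either stage, with the detector or the verifier taking the place of the toy threshold.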

Cite

Forczmański, P., & Markiewicz, A. (2016). Two-stage approach to extracting visual objects from paper documents. Machine Vision and Applications, 27(8), 1243–1257. https://doi.org/10.1007/s00138-016-0803-5
