Page segmentation techniques in document analysis


Abstract

In this chapter, we describe the main concepts and methods of page segmentation, the task of dividing a page image into homogeneous components such as text blocks, figures, and tables. Together with the classification of the segmented components, described in Chap. 7 (Page Similarity and Classification), page segmentation makes up the overall process called layout analysis. The chapter begins by classifying page layout structures from several viewpoints, including the levels of components and the printing colors. We then classify the methods that handle each class of layout according to three criteria: (1) the object of analysis, foreground or background; (2) the primitives of analysis, such as pixels, connected components, and maximal empty rectangles; and (3) the strategy of analysis, top-down or bottom-up. The classified methods are described in detail and compared with one another to clarify their advantages and disadvantages.

Citation (APA)

Kise, K. (2014). Page segmentation techniques in document analysis. In Handbook of Document Image Processing and Recognition (pp. 135–175). Springer London. https://doi.org/10.1007/978-0-85729-859-1_5
