Two complementary techniques for digitized document analysis

Abstract

Two complementary methods are proposed for characterizing the spatial structure of digitized technical documents and labelling various logical components without using optical character recognition. The top-down method segments and labels the page image simultaneously using publication-specific information in the form of a page-grammar. The bottom-up method naively segments the document into rectangles that contain individual connected components, combines blocks using knowledge about generic layout objects, and identifies logical objects using publication-specific knowledge. Both methods are based on the X-Y tree representation of a page image. The procedures are demonstrated on scanned and synthesized bit-maps of the title pages of technical articles.
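
As a rough illustration of the X-Y tree representation that both methods rely on, the sketch below builds such a tree by recursively cutting a binary page image at white gaps in its row and column projection profiles. It is a minimal sketch under stated assumptions, not the authors' implementation: the names Node, ink_runs, build_xy_tree, and leaves, the min_gap threshold, and the toy PAGE bitmap are all hypothetical. It does not cover the page-grammar labelling of the top-down method or the connected-component merging of the bottom-up method.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Image = List[List[int]]  # assumed encoding: 1 = ink (black), 0 = background


@dataclass
class Node:
    # Block bounds: rows [top, bottom), columns [left, right)
    top: int
    bottom: int
    left: int
    right: int
    children: List["Node"] = field(default_factory=list)


def ink_runs(profile: List[int], min_gap: int) -> List[Tuple[int, int]]:
    """Return [start, end) runs of non-zero entries, merging runs that are
    separated by fewer than min_gap zero entries."""
    runs: List[Tuple[int, int]] = []
    start = None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i
        elif not value and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    merged: List[Tuple[int, int]] = []
    for s, e in runs:
        if merged and s - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], e)
        else:
            merged.append((s, e))
    return merged


def build_xy_tree(img: Image, top: int, bottom: int, left: int, right: int,
                  horizontal: bool = True, min_gap: int = 2,
                  tried_other: bool = False) -> Node:
    """Recursively split a block, alternating horizontal and vertical cuts."""
    node = Node(top, bottom, left, right)
    if horizontal:  # row profile: ink count per row, used for horizontal cuts
        profile = [sum(img[r][left:right]) for r in range(top, bottom)]
    else:           # column profile: ink count per column, for vertical cuts
        profile = [sum(img[r][c] for r in range(top, bottom))
                   for c in range(left, right)]
    runs = ink_runs(profile, min_gap)
    if len(runs) <= 1:
        if tried_other or not runs:
            return node  # leaf: empty block, or no cut in either direction
        s, e = runs[0]   # trim the margins, then try the other direction once
        if horizontal:
            return build_xy_tree(img, top + s, top + e, left, right,
                                 False, min_gap, True)
        return build_xy_tree(img, top, bottom, left + s, left + e,
                             True, min_gap, True)
    for s, e in runs:    # one child per run; children cut the other way
        if horizontal:
            child = build_xy_tree(img, top + s, top + e, left, right,
                                  False, min_gap)
        else:
            child = build_xy_tree(img, top, bottom, left + s, left + e,
                                  True, min_gap)
        node.children.append(child)
    return node


def leaves(node: Node) -> List[Tuple[int, int, int, int]]:
    """Collect leaf rectangles as (top, bottom, left, right), in tree order."""
    if not node.children:
        return [(node.top, node.bottom, node.left, node.right)]
    out: List[Tuple[int, int, int, int]] = []
    for child in node.children:
        out.extend(leaves(child))
    return out


# Toy page: a full-width "title" row above two separated "columns".
PAGE: Image = [
    [1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0, 0],
]
root = build_xy_tree(PAGE, 0, len(PAGE), 0, len(PAGE[0]))
```

On this toy page, leaves(root) yields three rectangles: the title row and the two column blocks. In a top-down scheme of the kind the abstract describes, each cut would also be checked against publication-specific rules. Here the only criterion is the width of the white gap, which is a simplification.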

Citation (APA)

Nagy, G., Kanai, J., Krishnamoorthy, M., Thomas, M., & Viswanathan, M. (1988). Two complementary techniques for digitized document analysis. In Proceedings of the ACM Conference on Document Processing Systems, DOCPROCS 1988 (pp. 169–176). Association for Computing Machinery, Inc. https://doi.org/10.1145/62506.62539
