Two complementary techniques for digitized document analysis

George Nagy; Junichi Kanai; Mukkai Krishnamoorthy; Mathews Thomas; Mahesh Viswanathan

Conference Proceedings (Open Access)

Proceedings of the ACM Conference on Document Processing Systems, DOCPROCS 1988 (1988) 169-176

DOI: 10.1145/62506.62539

23 Citations

8 Readers

Abstract

Two complementary methods are proposed for characterizing the spatial structure of digitized technical documents and labelling various logical components without using optical character recognition. The top-down method segments and labels the page image simultaneously using publication-specific information in the form of a page-grammar. The bottom-up method naively segments the document into rectangles that contain individual connected components, combines blocks using knowledge about generic layout objects, and identifies logical objects using publication-specific knowledge. Both methods are based on the X-Y tree representation of a page image. The procedures are demonstrated on scanned and synthesized bit-maps of the title pages of technical articles.

Citation (APA)

Nagy, G., Kanai, J., Krishnamoorthy, M., Thomas, M., & Viswanathan, M. (1988). Two complementary techniques for digitized document analysis. In Proceedings of the ACM Conference on Document Processing Systems, DOCPROCS 1988 (pp. 169–176). Association for Computing Machinery, Inc. https://doi.org/10.1145/62506.62539
