Document structure analysis based on layout and textual features

  • Klink S
  • Dengel A
  • Kieninger T

Abstract

Document image processing is a crucial part of office automation. It begins with the 'OCR' phase and carries the difficulties of document 'analysis' and 'understanding'. This paper presents a hybrid and comprehensive approach to document structure analysis. It is hybrid in the sense that it makes use of layout (geometrical) as well as textual features of a given document. These features are the basis for potential conditions, which in turn are used to express fuzzy-matched rules of an underlying rule base. Rules can be formulated based on features observed within one specific layout object, but they can also express dependencies between different layout objects. In addition to its rule-driven analysis, which allows easy adaptation to specific domains with their specific logical objects, the system contains domain-independent markup algorithms for common objects (e.g. lists).

1 Introduction

In the office automation context processing, filing and retrieving of d...
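
The rule-based matching described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `LayoutObject`, `Rule`, `font_size_at_least`, `matches_pattern`, `is_topmost` and `label_objects` are hypothetical, and combining condition scores with `min` (a common fuzzy AND) is an assumption, since the abstract does not say how the fuzzy match is computed.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class LayoutObject:
    """Hypothetical layout object: a text block with geometry and font size."""
    text: str
    x: float
    y: float          # distance from the top of the page
    font_size: float


# A condition maps (object, all objects on the page) to a score in [0, 1].
Condition = Callable[[LayoutObject, List[LayoutObject]], float]


def font_size_at_least(threshold: float, softness: float = 2.0) -> Condition:
    """Layout feature: ramps from 0 at (threshold - softness) to 1 at threshold."""
    def check(obj: LayoutObject, page: List[LayoutObject]) -> float:
        score = (obj.font_size - (threshold - softness)) / softness
        return max(0.0, min(1.0, score))
    return check


def matches_pattern(pattern: str) -> Condition:
    """Textual feature: 1.0 if the text matches the regular expression, else 0.0."""
    regex = re.compile(pattern)
    def check(obj: LayoutObject, page: List[LayoutObject]) -> float:
        return 1.0 if regex.search(obj.text) else 0.0
    return check


def is_topmost() -> Condition:
    """Dependency between layout objects: no other object lies above this one."""
    def check(obj: LayoutObject, page: List[LayoutObject]) -> float:
        return 1.0 if all(obj.y <= other.y for other in page) else 0.0
    return check


@dataclass
class Rule:
    label: str
    conditions: List[Condition]

    def score(self, obj: LayoutObject, page: List[LayoutObject]) -> float:
        # Assumed fuzzy AND: the weakest condition decides the rule score.
        return min(cond(obj, page) for cond in self.conditions)


def label_objects(page: List[LayoutObject], rules: List[Rule],
                  min_score: float = 0.5) -> List[Optional[str]]:
    """Assign each object the best-scoring label above min_score, else None."""
    labels: List[Optional[str]] = []
    for obj in page:
        best_label, best_score = None, min_score
        for rule in rules:
            s = rule.score(obj, page)
            if s > best_score:
                best_label, best_score = rule.label, s
        labels.append(best_label)
    return labels
```

In this sketch a "title" rule might combine `font_size_at_least(16)` with `is_topmost()`, while a "heading" rule might combine a textual pattern with a smaller font threshold. Adapting to a new domain would then mean editing the list of rules rather than the matching code.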

Cite

Klink, S., Dengel, A., & Kieninger, T. (2000). Document structure analysis based on layout and textual features. Proc. of International Workshop on Document Analysis Systems, DAS2000, 99–111. Retrieved from http://scholar.google.com/scholar?hl=en&btnG=Search&q=intitle:Document+Structure+Analysis+Based+on+Layout+and+Textual+Features#0
