Data model for document transformation and assembly

Abstract

This paper presents a data model for transforming and assembling document information such as SGML or XML documents. Its biggest advantage over other data models is that it simultaneously provides (1) powerful patterns and contextual conditions, and (2) schema transformation. Patterns capture conditions on subordinates, whereas contextual conditions capture conditions on superiors, siblings, subordinates of siblings, and so on; both have been recognized in the document processing community as highly important mechanisms for identifying document components. Meanwhile, schema transformation has been recognized as crucial in the database community since the relational database (RDB). However, no previous data model has provided all three: patterns, contextual conditions, and schema transformation. This data model is based on forest-regular language theory. A schema is a forest automaton, and an instance is a finite set of forests (sequences of trees). Since the parse tree set of an extended context-free grammar is accepted by a forest automaton, this model is a generalization of Gonnet and Tompa's grammatical model. Patterns are captured as forest automata; contextual conditions are captured as pointed forest representations (a variation of Podelski’s pointed tree representations). Controlled by patterns and contextual conditions, an operator creates an instance from an input instance and also creates a reasonably small schema from an input schema. Furthermore, the created schema is often minimally sufficient: any forest permitted by it can be generated from some input instance.
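To make "a schema is a forest automaton" concrete, here is a minimal Python sketch of a bottom-up forest automaton that accepts or rejects a forest. It is an illustration under stated assumptions, not the paper's formal construction: the names `ForestAutomaton`, `state_of`, and `accepts`, the toy `doc`/`title`/`para` schema, and the encoding of states as single characters (so that Python regular expressions can stand in for regular sets of state sequences) are all hypothetical choices made for this example.

```python
import re
from dataclasses import dataclass

# Assumed encoding: a tree is (label, children) and a forest is a tuple of trees.
# States are single characters so a sequence of states is just a string.

@dataclass
class ForestAutomaton:
    rules: list   # (label, regex over the children's states, resulting state)
    final: str    # regex over the states of the top-level trees

    def state_of(self, tree):
        """Assign a state to a tree bottom-up, or None if no rule applies."""
        label, children = tree
        child_states = [self.state_of(child) for child in children]
        if None in child_states:
            return None
        word = "".join(child_states)
        # Simplification: the first matching rule wins.
        for rule_label, pattern, state in self.rules:
            if rule_label == label and re.fullmatch(pattern, word):
                return state
        return None

    def accepts(self, forest):
        """A forest is accepted if its top-level state sequence matches `final`."""
        states = [self.state_of(tree) for tree in forest]
        if None in states:
            return False
        return re.fullmatch(self.final, "".join(states)) is not None


# Toy schema: a forest of one or more `doc` trees, each a title followed by paragraphs.
schema = ForestAutomaton(
    rules=[
        ("title", "", "t"),    # title has no element children
        ("para", "", "p"),     # para has no element children
        ("doc", "tp*", "d"),   # doc = title, then zero or more paras
    ],
    final="d+",
)

good = (("doc", (("title", ()), ("para", ()), ("para", ()))),)
bad = (("doc", (("para", ()), ("title", ()))),)   # title must come first
```

With these definitions, `schema.accepts(good)` holds and `schema.accepts(bad)` does not, because the child-state sequence `pt` fails to match `tp*`. An instance in the abstract's sense would then be a finite set of forests, each accepted by the schema.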

Cite

Murata, M. (1998). Data model for document transformation and assembly. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1481, pp. 140–152). Springer Verlag. https://doi.org/10.1007/3-540-49654-8_12