Approximate graph schema extraction for semi-structured data

Qiu Yue Wang; Jeffrey Xu Yu; Kam Fai Wong

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2000) 1777 302-316

DOI: 10.1007/3-540-46439-5_21

28 Citations

22 Readers

Abstract

Semi-structured data are typically represented as labeled directed graphs. They are self-describing and schemaless, and the lack of a schema makes query processing over them expensive. To address this, some researchers have proposed using the structure of the data itself as a schema representation; such schemas are commonly referred to as graph schemas. However, because semi-structured data are irregular and frequently modified, an accurate graph schema is costly to construct and, worse still, difficult to maintain afterwards. Furthermore, an accurate graph schema is generally very large and hence impractical. This paper proposes an approximation approach to graph schema extraction. Approximation is achieved by summarizing the semi-structured data graph using an incremental clustering method. Preliminary experimental results show that approximate graph schemas are more compact than conventional accurate graph schemas and are promising for evaluating queries that involve regular path expressions.
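To make the idea of summarizing a data graph by incremental clustering concrete, the following is a minimal illustrative sketch. It is not the algorithm from the paper, whose details the abstract does not give. It assumes that nodes are compared by the Jaccard similarity of their outgoing edge labels, that each node joins the most similar existing cluster when the similarity reaches a hypothetical `threshold` and otherwise starts a new cluster, and that a schema edge is kept between two clusters whenever any of their members are connected by that label. The function names and the sample data are invented for illustration.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity of two label sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def approximate_schema(edges, threshold=0.5):
    """Cluster nodes in one pass and return (node -> cluster id, schema edges).

    edges: iterable of (source, label, target) triples.
    threshold: assumed similarity cut-off (hypothetical parameter).
    """
    edges = list(edges)
    out_labels = defaultdict(set)
    nodes, seen = [], set()
    for src, label, dst in edges:
        out_labels[src].add(label)
        for node in (src, dst):
            if node not in seen:
                seen.add(node)
                nodes.append(node)

    clusters = []    # clusters[i] is the union of label sets of its members
    assignment = {}  # node -> index into clusters
    for node in nodes:
        labels = out_labels.get(node, set())
        best, best_sim = None, threshold
        for idx, signature in enumerate(clusters):
            sim = jaccard(labels, signature)
            if sim >= best_sim:
                best, best_sim = idx, sim
        if best is None:
            # No cluster is similar enough: start a new one.
            clusters.append(set(labels))
            best = len(clusters) - 1
        else:
            # Absorb the node and widen the cluster's label signature.
            clusters[best] |= labels
        assignment[node] = best

    schema_edges = {(assignment[s], lab, assignment[d]) for s, lab, d in edges}
    return assignment, schema_edges


# Hypothetical data graph: two "person" objects with slightly different fields.
edges = [
    ("db", "person", "p1"), ("db", "person", "p2"),
    ("p1", "name", "n1"), ("p1", "age", "a1"),
    ("p2", "name", "n2"), ("p2", "age", "a2"), ("p2", "email", "e2"),
]
assignment, schema_edges = approximate_schema(edges)
```

In this sample, the two person nodes differ only by an `email` edge, so they fall into the same cluster and the seven data edges collapse into four schema edges. Raising `threshold` would keep them apart and yield a larger, more exact summary, which is the compactness versus accuracy trade-off the abstract describes.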

Cite

Wang, Q. Y., Yu, J. X., & Wong, K. F. (2000). Approximate graph schema extraction for semi-structured data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1777, pp. 302–316). Springer Verlag. https://doi.org/10.1007/3-540-46439-5_21