Clustering XML documents by structure

Theodore Dalamagas; Tao Cheng; Klaas Jan Winkel; Timos Sellis

Conference Proceedings

Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (2004) 3025 112-121

DOI: 10.1007/978-3-540-24674-9_13

24 citations

30 readers

Abstract

This work explores clustering methods for grouping structurally similar XML documents. Modeling XML documents as rooted, ordered, labeled trees, we apply clustering algorithms with distances that estimate the similarity between those trees in terms of the hierarchical relationships of their nodes. We propose using tree structural summaries to improve the performance of the distance calculation while maintaining, or even improving, its quality. Experimental results obtained with a prototype testbed are provided.
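The pipeline the abstract outlines (parse documents into trees, summarize them, measure pairwise distance, cluster) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the authors' method: the summary merely merges same-label siblings, the distance is a Jaccard distance over root-to-node label paths rather than a tree edit distance, and the clustering is a simple single-link grouping under a threshold. The function names to_tree, summarize, label_paths, structural_distance, and cluster are hypothetical.

```python
import xml.etree.ElementTree as ET
from itertools import combinations


def to_tree(xml_text):
    """Parse XML into a rooted ordered labeled tree: (label, [children])."""
    def build(elem):
        return (elem.tag, [build(child) for child in elem])
    return build(ET.fromstring(xml_text))


def summarize(tree):
    """Hypothetical summary: merge same-label siblings, keeping first-seen order."""
    label, children = tree
    merged = {}  # child label -> combined grandchildren (dicts keep insertion order)
    for child_label, grandchildren in children:
        merged.setdefault(child_label, []).extend(grandchildren)
    return (label, [summarize((lbl, kids)) for lbl, kids in merged.items()])


def label_paths(tree, prefix=()):
    """Return the set of root-to-node label paths in the tree."""
    label, children = tree
    path = prefix + (label,)
    paths = {path}
    for child in children:
        paths |= label_paths(child, path)
    return paths


def structural_distance(a, b):
    """Jaccard distance between path sets (a stand-in, not a tree edit distance)."""
    pa, pb = label_paths(a), label_paths(b)
    return 1.0 - len(pa & pb) / len(pa | pb)


def cluster(trees, threshold):
    """Single-link grouping: join any two trees closer than the threshold."""
    parent = list(range(len(trees)))

    def find(i):
        while parent[i] != i:
            i = parent[i]
        return i

    for i, j in combinations(range(len(trees)), 2):
        if structural_distance(trees[i], trees[j]) < threshold:
            parent[find(j)] = find(i)
    groups = {}
    for i in range(len(trees)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

A typical call would be cluster([summarize(to_tree(d)) for d in docs], threshold=0.5), which returns lists of document indices. With this particular stand-in distance the summary leaves the path sets unchanged, so it only shrinks the trees being compared; a real tree edit distance would be affected by the summary in ways this sketch does not capture.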

Citation (APA)

Dalamagas, T., Cheng, T., Winkel, K. J., & Sellis, T. (2004). Clustering XML documents by structure. In Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (Vol. 3025, pp. 112–121). Springer Verlag. https://doi.org/10.1007/978-3-540-24674-9_13
