Efficient similarity search for hierarchical data in large databases

Karin Kailing; Hans Peter Kriegel; Stefan Schönauer; Thomas Seidl

Journal Article

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 2992, pp. 676-693 (2004)

DOI: 10.1007/978-3-540-24741-8_39

Abstract

Structured and semi-structured object representations are becoming increasingly important for modern database applications. Examples of such data are hierarchical structures such as chemical compounds, XML data, and image data. As a key feature, database systems must support the search for similar objects, taking into account both the structure and the content features of the objects. A successful approach is to use the edit distance for tree-structured data. Because computing this measure is NP-complete, constrained edit distances have been applied to trees instead. While they yield good results, they are still computationally expensive and therefore of limited benefit for searching large databases. In this paper, we propose a filter-and-refinement architecture to overcome this problem. We present a set of new filter methods for structural and content-based information in tree-structured data, as well as ways to flexibly combine different filter criteria. The efficiency of our methods, which results from the good selectivity of the filters, is demonstrated in extensive experiments with real-world applications. © Springer-Verlag 2004.
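The filter-and-refinement idea named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: to keep it short, it uses flat label sequences (strings) as a stand-in for trees, the Levenshtein distance as a stand-in for the expensive tree edit distance, and a label-histogram bound as the cheap filter. The function names `levenshtein`, `histogram_lower_bound`, and `range_query` are hypothetical. The property that matters is that the filter never overestimates the exact distance, so pruning with it cannot dismiss a true result.

```python
from collections import Counter


def levenshtein(a, b):
    """Exact edit distance between two label sequences (unit costs).

    Stand-in for the expensive distance used in the refinement step.
    """
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # keep or relabel
        prev = cur
    return prev[-1]


def histogram_lower_bound(a, b):
    """Cheap filter distance: half the L1 distance of the label histograms.

    An insert or delete changes the L1 distance by 1 and a relabel by at
    most 2, so this value never exceeds the exact edit distance.
    """
    ca, cb = Counter(a), Counter(b)
    l1 = sum(abs(ca[k] - cb[k]) for k in ca.keys() | cb.keys())
    return l1 / 2


def range_query(database, query, eps, lower_bound, exact):
    """Return all objects within distance eps of the query.

    Filter step: discard objects whose lower bound already exceeds eps.
    Refinement step: evaluate the exact distance on the survivors only.
    """
    candidates = [o for o in database if lower_bound(o, query) <= eps]
    results = [o for o in candidates if exact(o, query) <= eps]
    return results, len(candidates)


database = ["abc", "abd", "xyz", "abcd", "cba"]
results, n_candidates = range_query(
    database, "abc", 1, histogram_lower_bound, levenshtein
)
print(results, n_candidates)
```

In this toy run the filter prunes `"xyz"` without computing its exact distance, while `"cba"` passes the filter and is rejected only in the refinement step. The more selective the filter, the fewer exact distance computations remain.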

Cite

Kailing, K., Kriegel, H. P., Schönauer, S., & Seidl, T. (2004). Efficient similarity search for hierarchical data in large databases. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2992, 676–693. https://doi.org/10.1007/978-3-540-24741-8_39
