Efficient similarity search for tree-structured data

Guoliang Li; Xuhui Liu; Jianhua Feng; Lizhu Zhou

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2008), vol. 5069 LNCS, pp. 131-149

DOI: 10.1007/978-3-540-69497-7_11

Abstract

Tree-structured data are becoming ubiquitous, and manipulating them based on similarity is essential for many applications. Although similarity search on textual data has been studied extensively, searching for similar trees remains an open problem because computing the similarity between trees is expensive, especially for large numbers of trees. In this paper, we propose to transform tree-structured data into strings with a one-to-one mapping. We prove that the edit distance between the corresponding strings forms a bound for similarity measures between trees, including tree edit distance, largest common subtrees, and smallest common super-trees. Based on this theoretical analysis, any existing algorithm for approximate string search can be employed for effective similarity search on trees. Moreover, we embed the bound into a filter-and-refine framework to facilitate similarity search on tree-structured data. The experimental results show that our algorithm achieves high performance and significantly outperforms state-of-the-art methods. Our method is especially suitable for accelerating similarity query processing on large numbers of trees in massive datasets. © 2008 Springer-Verlag.
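
The filter-and-refine idea described in the abstract can be illustrated with a minimal Python sketch. It does not reproduce the paper's one-to-one mapping, which the abstract does not specify. Instead it assumes ordered labeled trees with unit-cost insert, delete, and relabel operations, and uses the preorder label sequence, whose string edit distance is a known lower bound on tree edit distance under those assumptions. The (label, children) tuple representation, the function names, and the caller-supplied exact_distance function are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# Assumption: a tree is a (label, children) pair; not the paper's representation.
Tree = Tuple[str, list]


def preorder_labels(tree: Tree) -> List[str]:
    """Return node labels in preorder, using an explicit stack."""
    out: List[str] = []
    stack = [tree]
    while stack:
        label, children = stack.pop()
        out.append(label)
        # Push children in reverse so the leftmost child is visited first.
        stack.extend(reversed(children))
    return out


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Unit-cost Levenshtein distance, keeping only one previous row."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # match or substitute
        prev = cur
    return prev[-1]


def similarity_search(query: Tree,
                      trees: Sequence[Tree],
                      tau: int,
                      exact_distance: Callable[[Tree, Tree], int]) -> List[Tree]:
    """Return the trees whose exact distance to the query is at most tau."""
    q = preorder_labels(query)
    results = []
    for t in trees:
        # Filter: the string distance is a lower bound on the tree edit
        # distance, so a value above tau means the tree can be pruned safely.
        if edit_distance(q, preorder_labels(t)) > tau:
            continue
        # Refine: run the expensive exact computation only on survivors.
        if exact_distance(query, t) <= tau:
            results.append(t)
    return results
```

Because the preorder sequence is not one-to-one (two differently shaped trees can share it), this filter admits more false positives than a mapping that also encodes structure would; the refine step removes them, so the result stays correct as long as exact_distance is a true tree edit distance.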

Cite

Li, G., Liu, X., Feng, J., & Zhou, L. (2008). Efficient similarity search for tree-structured data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5069 LNCS, pp. 131–149). https://doi.org/10.1007/978-3-540-69497-7_11
