The q-gram distance for ordered unlabeled trees

Abstract

In this paper, we investigate the q-gram distance for ordered unlabeled trees (trees, for short). First, we formulate a q-gram as a tree with $q$ nodes that is isomorphic to a line graph (a path), and define the q-gram distance between two trees analogously to the q-gram distance between two strings. Then, using the depth sequence based on postorder, we design the algorithm EnumGram, which enumerates all q-grams in a tree $T$ with $n$ nodes in $O(n^2)$ time and $O(q)$ space. Furthermore, we improve it to the algorithm LinearEnumGram, which runs in $O(qn)$ time and $O(qd)$ space, where $d$ is the depth of $T$. Hence, we can evaluate the q-gram distance $D_q(T_1, T_2)$ between $T_1$ and $T_2$ in $O(q \max\{n_1, n_2\})$ time and $O(q \max\{d_1, d_2\})$ space, where $n_i$ and $d_i$ are the number of nodes in $T_i$ and the depth of $T_i$, respectively. Finally, we show the following relationship between the q-gram distance and the edit distance $E(T_1, T_2)$: $D_q(T_1, T_2) \le (gl + 1)\,E(T_1, T_2)$, where $g = \max\{g_1, g_2\}$, $l = \max\{l_1, l_2\}$, $g_i$ is the degree of $T_i$, and $l_i$ is the number of leaves in $T_i$. In particular, for the top-down tree edit distance $F(T_1, T_2)$, this result implies that $D_q(T_1, T_2) \le \min\{g^{q-2}, l - 1\}\,F(T_1, T_2)$. © Springer-Verlag Berlin Heidelberg 2005.
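
To make the definitions in the abstract concrete, the following is a minimal Python sketch of a q-gram profile and distance for small trees. It is not the paper's EnumGram or LinearEnumGram: it is a naive counting pass written for illustration only. Several things are assumptions not stated in the abstract: a tree is encoded as nested lists (a node is the list of its children), a q-gram shape is identified by a pair `(a, b)` giving the number of nodes in the left and right branch below the topmost node of the path (with `(q - 1, 0)` standing for a vertical chain), and the distance is taken to be the L1 distance between shape counts, by analogy with the q-gram distance for strings. The names `qgram_profile` and `qgram_distance` are hypothetical.

```python
from collections import Counter


def qgram_profile(tree, q):
    """Count the q-node paths in `tree`, grouped by shape (a, b).

    Assumed encoding: a node is the list of its children, so [] is a leaf.
    Uses recursion, so it is only suitable for trees of modest depth.
    """
    profile = Counter()

    def visit(node):
        child_tables = [visit(child) for child in node]
        # down[k] = number of downward paths with k edges starting at this node
        down = [1] + [0] * (q - 1)
        for table in child_tables:
            for k in range(1, q):
                down[k] += table[k - 1]
        # Vertical chains of q nodes whose topmost node is this node.
        profile[(q - 1, 0)] += down[q - 1]
        # Two-branch paths: this node on top, a nodes going down through an
        # earlier (left) child and b nodes going down through a later child.
        left = [0] * (q - 1)  # left[k]: k-edge downward paths from earlier children
        for table in child_tables:
            for a in range(1, q - 1):
                b = q - 1 - a
                profile[(a, b)] += left[a - 1] * table[b - 1]
            for k in range(q - 1):
                left[k] += table[k]
        return down

    visit(tree)
    # Keep only shapes that actually occur.
    return {shape: count for shape, count in profile.items() if count}


def qgram_distance(t1, t2, q):
    """L1 distance between the two profiles (an assumed definition)."""
    p1, p2 = qgram_profile(t1, q), qgram_profile(t2, q)
    return sum(abs(p1.get(s, 0) - p2.get(s, 0)) for s in p1.keys() | p2.keys())


chain = [[[]]]    # three nodes in a vertical chain
cherry = [[], []]  # a root with two leaf children
print(qgram_distance(chain, cherry, 3))
```

Under these assumptions, the chain and the cherry each contain exactly one 3-gram, but of different shapes, while for q = 2 both contain two edges and are indistinguishable. The sketch costs roughly O(qn) arithmetic operations per tree but stores a table of size q for every node on the recursion stack, so it should not be read as meeting the space bounds quoted in the abstract.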

Cite

Ohkura, N., Hirata, K., Kuboyama, T., & Harao, M. (2005). The q-gram distance for ordered unlabeled trees. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3735 LNAI, pp. 189–202). https://doi.org/10.1007/11563983_17
