HybridTreeMiner: An efficient algorithm for mining frequent rooted trees and free trees using canonical forms

  • Chi Y
  • Yang Y
  • Muntz R
  • 4

    Readers

    Mendeley users who have this article in their library.
  • N/A

    Citations

    Citations of this article.

Abstract

Tree structures are used extensively in domains such as computational biology, pattern recognition, XML databases, computer networks, and so on. In this paper, we present HybridTreeMiner, a computationally efficient algorithm that discovers all frequently occurring subtrees in a database of rooted unordered trees. The algorithm mines frequent subtrees by traversing an enumeration tree that systematically enumerates all subtrees. The enumeration tree is defined based on a novel canonical form for rooted unordered trees-the breadth-first canonical form (BFCF). By extending the definitions of our canonical form and enumeration tree to free trees, our algorithm can efficiently handle databases of free trees as well. We study the performance of our algorithms through extensive experiments based on both synthetic data and datasets from real applications. The experiments show that our algorithm is competitive in comparison to known rooted tree mining algorithms and is faster by one to two orders of magnitudes compared to a known algorithm for mining frequent free trees.

Author-supplied keywords

  • canonical form
  • enumeration tree
  • free
  • frequent subtree
  • morphism
  • rooted unordered tree
  • tree iso-

Get free article suggestions today

Mendeley saves you time finding and organizing research

Sign up here
Already have an account ?Sign in

Find this document

Authors

  • Y Chi

  • Y Yang

  • Rr Muntz

Cite this document

Choose a citation style from the tabs below

Save time finding and organizing research with Mendeley

Sign up for free