This chapter extends the TMG framework to the mining of unordered induced and embedded subtrees. Although online tree-structured documents such as XML present information in a particular order, in many applications the order among sibling nodes is unimportant or irrelevant to the task, and it is often not available. When comparing different document structures, or when a document is composed of data from several heterogeneous sources, the order of sibling nodes commonly differs even though the information contained in the structure is essentially the same. In these cases, mining unordered subtrees is more suitable, because a user can pose queries without having to worry about the order. All matching substructures are returned; the difference is that the order of sibling nodes is not used as an additional criterion for grouping candidates. Hence, the main difference in mining unordered subtrees is that the sibling nodes of a subtree can be exchanged and the resulting tree is still considered the same. © 2011 Springer-Verlag Berlin Heidelberg.
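The idea that exchanging sibling nodes leaves a tree unchanged can be illustrated with a small sketch. This is not the canonical form or candidate grouping used by the TMG framework, which the abstract does not specify; it is a generic sorted-string encoding chosen for illustration. The `(label, children)` tuple representation, the function names `canonical_form` and `same_unordered`, and the sample labels are all assumptions introduced here.

```python
from typing import List, Tuple

# Hypothetical representation for illustration: a tree is (label, children).
# Labels are assumed not to contain parentheses, so the encoding is unambiguous.
Tree = Tuple[str, List["Tree"]]


def canonical_form(tree: Tree) -> str:
    """Encode a tree so that any ordering of siblings gives the same string."""
    label, children = tree
    # Encode each child first, then sort the encodings so that the
    # original left-to-right order of the siblings no longer matters.
    encoded_children = sorted(canonical_form(child) for child in children)
    return "(" + label + "".join(encoded_children) + ")"


def same_unordered(a: Tree, b: Tree) -> bool:
    """Two trees are the same unordered tree if their encodings match."""
    return canonical_form(a) == canonical_form(b)


# t1 and t2 differ only in the order of the children of "book".
t1: Tree = ("book", [("title", []), ("author", [("name", [])])])
t2: Tree = ("book", [("author", [("name", [])]), ("title", [])])
# t3 has the same labels but a different parent-child structure.
t3: Tree = ("book", [("title", [("name", [])]), ("author", [])])

print(same_unordered(t1, t2))  # sibling order differs only
print(same_unordered(t1, t3))  # structure differs
```

Under this encoding `t1` and `t2` map to the same string and would be counted as one candidate, while `t3` stays distinct because the `name` node hangs under a different parent.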
Hadzic, F., Tan, H., & Dillon, T. S. (2011). TMG framework for mining unordered subtrees. Studies in Computational Intelligence, 333, 139–174. https://doi.org/10.1007/978-3-642-17557-2_6