XML document classification using closed frequent subtree

Songlin Wang; Yihong Hong; Jianwu Yang

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2012), Vol. 7419 LNCS, pp. 350–359

DOI: 10.1007/978-3-642-33050-6_34

3 Citations

2 Readers

Abstract

This paper introduces an efficient classification approach for XML documents that combines the content of the documents with their structure to compute the similarity between categories and documents. The approach is based on the Support Vector Machine (SVM) algorithm and the Structured Link Vector Model (SLVM), which uses closed frequent subtrees as its structural units. A document tree pruning strategy is applied to improve the classification system, and the link information between documents is taken into account to obtain better classification results. Experiments on the INEX XML mining data sets that combine these techniques show that the approach performs better than the competing approaches on XML classification. © 2012 Springer-Verlag.
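To make the notion of a closed frequent structural unit concrete, the following is a minimal Python sketch and not the authors' algorithm. It assumes a strong simplification: root-anchored tag paths stand in for general subtrees, and a path counts as closed when no longer path extending it occurs in the same number of documents. The function names `root_paths` and `closed_frequent_paths` and the `min_support` parameter are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def root_paths(xml_text):
    """Return the set of root-anchored tag paths occurring in one document."""
    root = ET.fromstring(xml_text)
    paths = set()
    stack = [(root, (root.tag,))]
    while stack:
        node, path = stack.pop()
        paths.add(path)
        for child in node:
            stack.append((child, path + (child.tag,)))
    return paths


def closed_frequent_paths(docs, min_support):
    """Map each closed frequent path to its document support.

    Assumption: paths are a simplified stand-in for subtrees.
    """
    support = Counter()
    for doc in docs:
        # Each path is counted once per document it appears in.
        support.update(root_paths(doc))
    frequent = {p: s for p, s in support.items() if s >= min_support}

    closed = {}
    for path, sup in frequent.items():
        # Support never increases as a path grows, so checking the
        # one-step extensions is enough to detect any equal-support
        # longer path.
        has_equal_extension = any(
            len(q) == len(path) + 1 and q[:len(path)] == path and s == sup
            for q, s in frequent.items()
        )
        if not has_equal_extension:
            closed[path] = sup
    return closed
```

Under these assumptions, each document could then be described by which closed frequent paths it contains, and such structural features could be combined with term weights before training a classifier such as an SVM. How the paper actually mines subtrees, prunes document trees, and weights the units is not specified in the abstract.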

Cite

Wang, S., Hong, Y., & Yang, J. (2012). XML document classification using closed frequent subtree. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7419 LNCS, pp. 350–359). https://doi.org/10.1007/978-3-642-33050-6_34
