This paper introduces an efficient classification approach for XML documents that combines document content with document structure to compute the similarity between categories and documents. The approach is based on the Support Vector Machine (SVM) algorithm and the Structured Link Vector Model (SLVM), with closed frequent subtrees serving as the structural units. A document tree pruning strategy is applied to improve the classification system, and link information between documents is taken into account to obtain better classification results. Experiments on the INEX XML mining data sets that combine these techniques show that the approach outperforms the competing approaches to XML classification. © 2012 Springer-Verlag.
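The idea of scoring a document against a category on both content and structure can be illustrated with a minimal sketch. It is not the authors' method: root-to-node tag paths stand in for closed frequent subtrees, a nearest-category rule stands in for the SVM, and pruning and link information are left out. The function names and the weight `alpha` are hypothetical and chosen only for illustration.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter


def extract_features(xml_text):
    """Return (content term counts, structural unit counts) for one document.

    Assumption: root-to-node tag paths are used as structural units,
    a simple stand-in for mined closed frequent subtrees.
    Text that follows a child element (node.tail) is ignored.
    """
    root = ET.fromstring(xml_text)
    terms, paths = Counter(), Counter()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        paths[path] += 1
        if node.text:
            terms.update(node.text.lower().split())
        for child in node:
            walk(child, path)

    walk(root, "")
    return terms, paths


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def combined_similarity(doc, category, alpha=0.5):
    """Weighted sum of content and structure similarity (alpha is hypothetical)."""
    content = cosine(doc[0], category[0])
    structure = cosine(doc[1], category[1])
    return alpha * content + (1 - alpha) * structure


def classify(xml_text, categories, alpha=0.5):
    """Assign the category with the highest combined similarity (not an SVM)."""
    doc = extract_features(xml_text)
    return max(
        categories,
        key=lambda name: combined_similarity(doc, categories[name], alpha),
    )


# Toy category profiles, each built from a single example document.
categories = {
    "article": extract_features(
        "<article><title>xml mining</title><body>tree classification</body></article>"
    ),
    "recipe": extract_features(
        "<recipe><name>soup</name><step>boil water</step></recipe>"
    ),
}
label = classify(
    "<article><title>tree mining</title><body>xml documents</body></article>",
    categories,
)
```

In this sketch `alpha` controls how much the content terms count relative to the structural units. A system closer to the one described would replace the tag paths with mined subtrees and feed the combined feature vectors to an SVM.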
Wang, S., Hong, Y., & Yang, J. (2012). XML document classification using closed frequent subtree. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7419 LNCS, pp. 350–359). https://doi.org/10.1007/978-3-642-33050-6_34