XRules: An effective algorithm for structural classification of XML data

Abstract

XML documents have recently become ubiquitous because they are used in a wide range of applications. Classification is an important problem in the data mining domain, but current classification methods for XML documents use IR-based methods in which each document is treated as a bag of words. Such techniques ignore a significant amount of the structural information hidden inside the documents. In this paper we discuss the problem of rule-based classification of XML data using frequent discriminatory substructures within XML documents. Such a technique is better able to find the classification characteristics of documents. In addition, the technique can be extended to cost-sensitive classification. We show the effectiveness of the method with respect to other classifiers. We note that the methodology discussed in this paper is applicable to any kind of semi-structured data.
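To make the idea in the abstract concrete, here is a minimal Python sketch of rule-based structural classification. It is not the XRules algorithm: as a simplifying assumption it uses root-to-node tag paths as the "substructures" instead of mined frequent subtrees, and the names `tag_paths`, `mine_rules`, `classify`, the thresholds, and the toy documents are all hypothetical. A path becomes a rule for a class when it is frequent within that class (support) and rare elsewhere (confidence).

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict


def tag_paths(xml_text):
    """Return the set of root-to-node tag paths found in one document."""
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(root, "")
    return paths


def mine_rules(docs, min_support=0.5, min_confidence=0.6):
    """docs is a list of (xml_text, label); returns [(path, label, confidence)]."""
    class_sizes = Counter(label for _, label in docs)
    counts = defaultdict(Counter)  # path -> label -> number of documents
    for xml_text, label in docs:
        for path in tag_paths(xml_text):
            counts[path][label] += 1

    rules = []
    for path, by_label in counts.items():
        total = sum(by_label.values())
        for label, n in by_label.items():
            support = n / class_sizes[label]  # how frequent within the class
            confidence = n / total            # how discriminative across classes
            if support >= min_support and confidence >= min_confidence:
                rules.append((path, label, confidence))
    rules.sort(key=lambda rule: -rule[2])     # strongest rules first
    return rules


def classify(xml_text, rules, default=None):
    """Predict the label of the strongest rule whose path occurs in the document."""
    paths = tag_paths(xml_text)
    for path, label, _ in rules:
        if path in paths:
            return label
    return default


# Toy training set (hypothetical): the tag "item" occurs in both classes,
# so only its position in the tree separates them.
docs = [
    ("<order><item><price/></item></order>", "purchase"),
    ("<order><item><price/><qty/></item></order>", "purchase"),
    ("<order><return><reason/></return></order>", "refund"),
    ("<order><return><item/></return></order>", "refund"),
]
rules = mine_rules(docs)
prediction = classify("<order><return><item/></return></order>", rules)
```

In this toy data a bag-of-words view sees `item` in both classes, while the path `/order/return/item` occurs only in refund documents, which is the kind of structural signal the abstract says IR-based methods discard.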

Cite

Zaki, M. J., & Aggarwal, C. C. (2006). XRules: An effective algorithm for structural classification of XML data. In Machine Learning (Vol. 62, pp. 137–170). https://doi.org/10.1007/s10994-006-5832-2
