Mining class hierarchies from XML data: Representation techniques


Abstract

In this paper, we describe a technique for extracting patterns from an XML data flow; we then show how such patterns can be developed into an ontology of classes. We also discuss the impact of different fuzzy representation techniques for XML data on the outcome of our procedure. One might wonder why all this is needed, since the semantics of XML data could in principle be satisfactorily represented via the ComplexTypes of their associated XML schemata. Unfortunately, standard XML schema definitions need to cover a wide repertoire of possible attributes. For this reason, optional elements are widely used, which decreases the expressiveness of XML schemata as descriptors of the content of single instances. Our approach relies on comparing fuzzy encodings of XML fragments. This comparison allows us to define "typical" sets of attributes, which we consider hints of possible meaningful classes. We then evaluate the fuzzy overlap between candidate cluster heads in order to define a tentative class hierarchy. Our fuzzy modelling assumes that a domain expert has associated an importance degree in the [0, 1] interval with each vocabulary element (i.e. tag name). As we shall see in the remainder of the paper, this burden is not excessive, since the importance assessment only needs to be carried out once, by looking at the schema. At run time, each incoming XML fragment is mapped into a fuzzy set whose elements are the tag names [3]. Each element's membership is computed by aggregating the vocabulary importance values of the tags lying on the path from it to the root. The topology of the individual XML tree is modelled by using an aggregation that takes into account node nesting level or node occurrence. Our procedure consists of the following steps. © 2006 Springer.
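
The fuzzy encoding and overlap comparison described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the `IMPORTANCE` table, the function names `fuzzy_encode` and `fuzzy_overlap`, the aggregation (mean of the importance values on the root-to-node path, damped by nesting depth), the choice of keeping the maximum degree when a tag occurs more than once, and the Jaccard-style overlap measure.

```python
import xml.etree.ElementTree as ET

# Hypothetical importance degrees in [0, 1], assigned once by a domain expert.
IMPORTANCE = {"person": 1.0, "name": 0.9, "address": 0.6, "city": 0.5, "email": 0.8}


def fuzzy_encode(xml_text, importance, default=0.0):
    """Map an XML fragment to a fuzzy set {tag name: membership degree}."""
    root = ET.fromstring(xml_text)
    memberships = {}

    def visit(node, path_values):
        # Importance values of all tags on the path from the root to this node.
        values = path_values + [importance.get(node.tag, default)]
        depth = len(values) - 1  # the root has depth 0
        # Assumed aggregation: mean over the path, damped by nesting depth.
        degree = (sum(values) / len(values)) / (1 + depth)
        # Assumed policy for repeated tags: keep the highest degree seen.
        memberships[node.tag] = max(memberships.get(node.tag, 0.0), degree)
        for child in node:
            visit(child, values)

    visit(root, [])
    return memberships


def fuzzy_overlap(a, b):
    """Jaccard-style overlap of two fuzzy sets (an assumed similarity measure)."""
    tags = set(a) | set(b)
    intersection = sum(min(a.get(t, 0.0), b.get(t, 0.0)) for t in tags)
    union = sum(max(a.get(t, 0.0), b.get(t, 0.0)) for t in tags)
    return intersection / union if union else 0.0


fragment_a = "<person><name>A</name><address><city>X</city></address></person>"
fragment_b = "<person><name>B</name><email>b@example.org</email></person>"

enc_a = fuzzy_encode(fragment_a, IMPORTANCE)
enc_b = fuzzy_encode(fragment_b, IMPORTANCE)
print(enc_a)
print(fuzzy_overlap(enc_a, enc_b))
```

Under these assumptions, fragments that share highly weighted tags near the root score a higher overlap, which is the kind of signal one could use to group fragments into candidate classes.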

Cite

Ceravolo, P., & Damiani, E. (2006). Mining class hierarchies from XML data: Representation techniques. Advances in Soft Computing, 33, 385–396. https://doi.org/10.1007/3-540-31182-3_36
