Ingredients for Accurate, Fast, and Robust XML Similarity Joins

Abstract

We consider the problem of answering similarity join queries on large, non-schematic, heterogeneous XML datasets. Realizing similarity joins on such datasets is challenging, because the semi-structured nature of XML substantially increases the complexity of the underlying similarity function in terms of both effectiveness and efficiency. Moreover, even selecting the pieces of information to use for similarity assessment is complicated, because these can appear in different parts of the documents in a dataset. In this paper, we present an approach that jointly calculates textual and structural similarity of XML trees while implicitly embedding similarity selection into join processing. We validate the accuracy, performance, and scalability of our techniques with a set of experiments in the context of an XML DBMS. © 2011 Springer-Verlag Berlin Heidelberg.
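To make the idea of combining textual and structural evidence concrete, the following is a minimal Python sketch of a naive XML similarity join. It is not the authors' algorithm, which the abstract does not specify. It assumes that each document is reduced to a set of (root-to-node path, word) tokens, that Jaccard similarity is the comparison function, and that all pairs are compared by brute force. The names `path_tokens`, `jaccard`, and `similarity_join` are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET
from itertools import combinations


def path_tokens(xml_text):
    """Reduce one XML document to a set of (path, word) tokens.

    Assumption for this sketch: pairing each word with its root-to-node
    path is one simple way to mix structural and textual information.
    """
    root = ET.fromstring(xml_text)
    tokens = set()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        # Only the text directly inside the element is tokenized here.
        for word in (node.text or "").lower().split():
            tokens.add((path, word))
        for child in node:
            walk(child, path)

    walk(root, "")
    return tokens


def jaccard(a, b):
    """Jaccard similarity of two token sets (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity_join(docs, threshold):
    """Return (i, j, score) for every document pair scoring >= threshold.

    Brute-force comparison of all pairs; a real system would use
    filtering or indexing to avoid this quadratic cost.
    """
    profiles = [path_tokens(d) for d in docs]
    result = []
    for i, j in combinations(range(len(docs)), 2):
        score = jaccard(profiles[i], profiles[j])
        if score >= threshold:
            result.append((i, j, score))
    return result
```

Because the tokens in this sketch carry full paths, the same text placed under different element names does not match at all. That illustrates the heterogeneity problem the abstract raises: a similarity function for such datasets has to tolerate structural variation rather than require identical paths.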

Cite

Ribeiro, L. A., & Härder, T. (2011). Ingredients for Accurate, Fast, and Robust XML Similarity Joins. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6861 LNCS, pp. 33–42). https://doi.org/10.1007/978-3-642-23091-2_3
