Evaluation of XPath queries over XML documents using SparkSQL framework

Abstract

In this contribution, we present our approach to querying XML documents stored in a distributed system. The main goal of this paper is to describe how to use the Spark SQL framework to implement a subset of expressions from the XPath query language. We introduce and compare five methods, and in doing so we also demonstrate the current state of query optimization on the Spark SQL platform, which can be regarded as a further contribution of the paper. The supported subset of XPath covers all axes except the attribute and namespace axes; predicates are not implemented in our prototype. We present the implemented system, data, measurements, tests, and results. The results support our belief that our method significantly decreases the data transfers that occur in the distributed system during query evaluation.
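As a rough illustration of how XPath axes can be expressed as relational queries, the sketch below numbers each element in pre-order and post-order and evaluates a few axes as SQL joins over the resulting table. This is not one of the five methods from the paper, whose details are not given in the abstract. The encoding, the table name `node`, the helper names, and the sample document are all assumptions made for illustration, and SQLite stands in for Spark SQL so that the example runs with the Python standard library alone.

```python
import sqlite3
import xml.etree.ElementTree as ET


def encode(xml_text):
    """Return rows (pre, post, parent_pre, tag), one per element (assumed encoding)."""
    root = ET.fromstring(xml_text)
    rows = []
    counters = {"pre": 0, "post": 0}

    def visit(elem, parent_pre):
        pre = counters["pre"]          # rank in document (pre-order) traversal
        counters["pre"] += 1
        slot = len(rows)
        rows.append(None)              # reserve the slot so rows stay in document order
        for child in elem:
            visit(child, pre)
        post = counters["post"]        # rank in post-order traversal
        counters["post"] += 1
        rows[slot] = (pre, post, parent_pre, elem.tag)

    visit(root, None)
    return rows


def load(rows):
    """Load the encoded nodes into an in-memory table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node (pre INTEGER, post INTEGER, parent INTEGER, tag TEXT)"
    )
    conn.executemany("INSERT INTO node VALUES (?, ?, ?, ?)", rows)
    return conn


# Each axis is a join between context nodes (c) and candidate nodes (d).
CHILD = """
SELECT d.tag FROM node AS c JOIN node AS d ON d.parent = c.pre
WHERE c.tag = ? ORDER BY d.pre
"""
DESCENDANT = """
SELECT d.tag FROM node AS c JOIN node AS d
  ON d.pre > c.pre AND d.post < c.post
WHERE c.tag = ? ORDER BY d.pre
"""
ANCESTOR = """
SELECT d.tag FROM node AS c JOIN node AS d
  ON d.pre < c.pre AND d.post > c.post
WHERE c.tag = ? ORDER BY d.pre
"""
FOLLOWING = """
SELECT d.tag FROM node AS c JOIN node AS d
  ON d.pre > c.pre AND d.post > c.post
WHERE c.tag = ? ORDER BY d.pre
"""


def axis(conn, sql, context_tag):
    """Evaluate one axis step from every element named context_tag."""
    return [row[0] for row in conn.execute(sql, (context_tag,))]


DOC = "<a><b><c/><d/></b><e/></a>"   # sample document, invented for the example
conn = load(encode(DOC))

print(axis(conn, DESCENDANT, "b"))   # elements below <b>
print(axis(conn, FOLLOWING, "b"))    # elements after <b> that are not its descendants
```

In this encoding the descendant, ancestor, and following axes become inequality joins on the two rank columns, while the child axis is an equality join on the parent column. How a distributed engine plans such joins affects how much data is moved between workers, which is the kind of cost the abstract refers to.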

Cite

Hricov, R., Šenk, A., Kroha, P., & Valenta, M. (2017). Evaluation of XPath queries over XML documents using SparkSQL framework. In Communications in Computer and Information Science (Vol. 716, pp. 28–41). Springer Verlag. https://doi.org/10.1007/978-3-319-58274-0_3
