Parallelizing Federated SPARQL Queries in Presence of Replicated Data

Abstract

Federated query engines have been enhanced to exploit the new data localities created by replicated data, e.g., Fedra. However, existing replication-aware federated query engines mainly focus on pruning sources during source selection and query decomposition, using data locality to reduce intermediate results. In this paper, we implement a replication-aware parallel join operator: Pen. This operator exploits replicated data during query execution. For existing replication-aware federated query engines, it uses replicated data to parallelize the execution of joins and reduce execution time. For Triple Pattern Fragment (TPF) clients, it uses the availability of several TPF servers exposing the same dataset to share the load among the servers. We implemented Pen in the federated query engine FedX, with the replication-aware source selection Fedra, and in the reference TPF client. We empirically evaluated the performance of engines extended with the Pen operator. The experimental results suggest that our extensions outperform the existing approaches in terms of execution time and balance of load among the servers, respectively.
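
The general idea of spreading the probes of a join over servers that hold the same data can be illustrated with a minimal sketch. This is not the Pen operator as specified in the paper: the in-memory `TRIPLES` list, the `probe` helper, the replica names, and the round-robin assignment are all assumptions made for illustration, and a real engine would issue SPARQL or TPF requests instead.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for replicated data: every replica exposes these triples.
TRIPLES = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("carol", "knows", "dave"),
    ("alice", "age", "30"),
]

def probe(replica, subject, predicate):
    """Simulate asking one replica for the objects matching (subject, predicate, ?y)."""
    objects = [o for s, p, o in TRIPLES if s == subject and p == predicate]
    return replica, objects

def parallel_replicated_join(left_bindings, predicate, replicas):
    """Join bindings of ?x with the pattern (?x predicate ?y).

    Each binding is sent to one replica, chosen round-robin (an assumption),
    and the probes run concurrently in a thread pool.
    """
    load = Counter()
    results = []
    with ThreadPoolExecutor(max_workers=len(replicas)) as pool:
        futures = [
            pool.submit(probe, replicas[i % len(replicas)], b["x"], predicate)
            for i, b in enumerate(left_bindings)
        ]
        # Collect in submission order so the output is deterministic.
        for binding, future in zip(left_bindings, futures):
            replica, objects = future.result()
            load[replica] += 1
            results.extend({**binding, "y": o} for o in objects)
    return results, dict(load)

left = [{"x": "alice"}, {"x": "bob"}, {"x": "carol"}, {"x": "dave"}]
rows, load = parallel_replicated_join(left, "knows", ["replica-a", "replica-b"])
```

In this sketch, `rows` holds the joined bindings and `load` counts how many probes each replica answered, which is the kind of quantity a load-balance evaluation would compare across servers.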

Cite

Minier, T., Montoya, G., Skaf-Molli, H., & Molli, P. (2017). Parallelizing Federated SPARQL Queries in Presence of Replicated Data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10577 LNCS, pp. 181–196). Springer Verlag. https://doi.org/10.1007/978-3-319-70407-4_33
