Cluster computing, recursion and datalog

Foto N. Afrati; Vinayak Borkar; Michael Carey; Neoklis Polyzotis; Jeffrey D. Ullman

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2011) 6702 LNCS 120-144

DOI: 10.1007/978-3-642-24206-9_8

Abstract

The cluster-computing environment typified by Hadoop, the open-source implementation of map-reduce, is receiving serious attention as the way to execute queries and other operations on very large-scale data. Datalog execution presents several unusual issues for this environment. We discuss the best way to execute a round of seminaive evaluation on a computing cluster using map-reduce. Using transitive closure as an example, we examine the cost of executing recursions in several different ways. Recursive processes such as evaluation of a recursive Datalog program do not fit the key map-reduce assumption that tasks deliver output only when they are completed. As a result, the resilience under compute-node failure that is a key element of the map-reduce framework is not supported for recursive programs. We discuss extensions to this framework that are suitable for executing recursive Datalog programs on very large-scale data in a way that allows progress to continue after node failures, without restarting the entire job. © 2011 Springer-Verlag.
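
For readers unfamiliar with seminaive evaluation, the following is a minimal single-machine sketch of the idea applied to transitive closure. It is not the paper's map-reduce algorithm. It assumes the linear Datalog formulation path(X,Y) :- edge(X,Y) and path(X,Y) :- path(X,Z), edge(Z,Y), and the function and variable names are hypothetical. The point it illustrates is that each round joins only the facts discovered in the previous round (the delta) with the edge relation, rather than re-deriving everything.

```python
def transitive_closure_seminaive(edges):
    """Return the transitive closure of a set of (x, y) edge pairs.

    Illustrative sketch only: runs in memory on one machine and assumes
    the linear rule path(X,Y) :- path(X,Z), edge(Z,Y).
    """
    # Index edges by source so the join delta(x, z) with edge(z, y) is a lookup.
    successors = {}
    for x, y in edges:
        successors.setdefault(x, set()).add(y)

    closure = set(edges)  # all path facts derived so far
    delta = set(edges)    # facts that were new in the previous round

    while delta:
        # Join only the new facts with the edge relation.
        derived = {(x, y) for (x, z) in delta for y in successors.get(z, ())}
        # Keep only facts that have never been seen before.
        delta = derived - closure
        closure |= delta

    return closure
```

In a cluster setting, each iteration of the loop would correspond roughly to one round of distributed joins and duplicate elimination, which is where the cost and failure-recovery questions raised in the abstract come in.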

Cite

CITATION STYLE

APA

Afrati, F. N., Borkar, V., Carey, M., Polyzotis, N., & Ullman, J. D. (2011). Cluster computing, recursion and datalog. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6702 LNCS, pp. 120–144). https://doi.org/10.1007/978-3-642-24206-9_8