Khuzdul: Efficient and Scalable Distributed Graph Pattern Mining Engine

5Citations
Citations of this article
5Readers
Mendeley users who have this article in their library.

Abstract

This paper proposes Khuzdul, a distributed execution engine with a well-defined abstraction that can be integrated with existing single-machine graph pattern mining (GPM) systems to provide efficiency and scalability at the same time. The key novelty is the extendable embedding abstraction which can express pattern enumeration algorithms, allow fine-grained task scheduling, and enable low-cost GPM-specific data reuse to reduce communication cost. The effective BFS-DFS hybrid exploration generates sufficient concurrent tasks for communication-computation overlapping with bounded memory consumption. Two scalable distributed GPM systems are implemented by porting Automine and GraphPi on Khuzdul. Our evaluation shows that Khuzdul based systems significantly outperform state-of-the-art distributed GPM systems with partitioned graphs by up to 75.5× (on average 19.0×), achieve similar or even better performance compared with the fastest distributed GPM systems with replicated graph, and scale to massive graphs with more than one hundred billion edges with a commodity cluster.

Cite

CITATION STYLE

APA

Chen, J., & Qian, X. (2023). Khuzdul: Efficient and Scalable Distributed Graph Pattern Mining Engine. In International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS (Vol. 2, pp. 413–426). Association for Computing Machinery. https://doi.org/10.1145/3575693.3575743

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free