In modern data centers, a massive number of concurrent graph processing jobs run on large graphs. However, existing hardware/software solutions suffer from irregular graph traversal and intense resource contention. In this paper, we propose LCCG, a Locality-Centric programmable accelerator that augments the many-core processor to achieve higher throughput for Concurrent Graph processing jobs. Specifically, we integrate a novel topology-aware execution approach into the accelerator design to regularize the graph traversals of multiple jobs on the fly according to the graph topology, which fully consolidates the graph data accesses from concurrent jobs. By reusing the same graph data among more jobs and coalescing the accesses to the vertex states of these jobs, LCCG can improve core utilization. We conduct extensive experiments on a simulated 64-core processor. The results show that LCCG improves the throughput of the cutting-edge software system by 11.3 to 23.9 times with only 0.5% additional area cost. Moreover, LCCG achieves speedups of 4.7 to 10.3, 5.5 to 13.2, and 3.8 to 8.4 times over state-of-the-art hardware graph processing accelerators (namely, HATS, Minnow, and PHI, respectively).
CITATION
Zhao, J., Zhang, Y., Liao, X., He, L., He, B., Jin, H., & Liu, H. (2021). LCCG: A Locality-Centric Hardware Accelerator for High Throughput of Concurrent Graph Processing. In International Conference for High Performance Computing, Networking, Storage and Analysis, SC. IEEE Computer Society. https://doi.org/10.1145/3458817.3480854