Grus

  • Wang P
  • Wang J
  • Li C
  • et al.

Abstract

Today’s GPU graph processing frameworks face scalability and efficiency issues as the graph size exceeds the GPU-dedicated memory limit. Although recent GPUs can over-subscribe memory with Unified Memory (UM), they incur significant overhead when handling graph-structured data. In addition, many popular processing frameworks suffer from sub-optimal efficiency due to heavy use of atomic operations when tracking the active vertices. This article presents Grus, a novel system framework that allows GPU graph processing to stay competitive with ever-growing graph complexity. Grus improves space efficiency through a UM trimming scheme tailored to the data access behaviors of graph workloads. It also uses a lightweight frontier structure to further reduce atomic operations. With an easy-to-use interface that abstracts the above details, Grus shows up to 6.4× average speedup over the state-of-the-art in-memory GPU graph processing framework. It can process large graphs of 5.5 billion edges in seconds on a single GPU.

Cite

Wang, P., Wang, J., Li, C., Wang, J., Zhu, H., & Guo, M. (2021). Grus. ACM Transactions on Architecture and Code Optimization, 18(2), 1–25. https://doi.org/10.1145/3444844
