NeutronStar: Distributed GNN Training with Hybrid Dependency Management

Abstract

GNN training must resolve vertex dependencies: each vertex's representation is updated from the representations of its neighbors. Existing distributed GNN systems adopt either a dependencies-cached approach or a dependencies-communicated approach. Through extensive experiments and analysis, we find that which approach performs best is determined by a set of factors, including the input graph, the model configuration, and the underlying computing cluster. A system that supports all GNN training workloads with only one approach therefore often delivers suboptimal performance. We analyze these factors for each GNN training job before execution to choose the best-fit approach accordingly, and we propose a hybrid dependency-handling approach that adaptively combines the merits of the two approaches at runtime. Based on this hybrid approach, we develop NeutronStar, a distributed GNN training system that achieves high performance automatically. NeutronStar is further strengthened by effective optimizations in CPU-GPU computation and data processing. Experimental results on a 16-node Aliyun cluster demonstrate that NeutronStar achieves 1.81X-14.25X speedups over existing GNN systems, including DistDGL and ROC.
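To make the trade-off concrete, the following is a minimal, hypothetical sketch of the idea the abstract describes: estimate the cost of the two dependency-handling strategies for a given workload and cluster, and pick the cheaper one. The cost model, class names, and numbers below are illustrative assumptions, not NeutronStar's actual implementation.

```python
# Hypothetical cost-model-based selection between the two dependency-handling
# strategies described in the abstract. All names and formulas are assumptions
# for illustration only.

from dataclasses import dataclass
from enum import Enum


class Strategy(Enum):
    CACHED = "dependencies-cached"              # replicate remote neighbors locally and recompute them
    COMMUNICATED = "dependencies-communicated"  # pull remote embeddings over the network each step


@dataclass
class LayerProfile:
    boundary_vertices: int      # vertices with neighbors on other workers
    avg_remote_degree: float    # average remote neighbors per boundary vertex
    hidden_dim: int             # embedding width produced by this GNN layer
    flops_per_vertex: float     # estimated compute cost to (re)compute one vertex
    bytes_per_float: float = 4.0


def estimate_cached_cost(p: LayerProfile, flops_per_sec: float) -> float:
    """Dependencies-cached: pay redundant computation for replicated remote neighbors."""
    redundant_vertices = p.boundary_vertices * p.avg_remote_degree
    return redundant_vertices * p.flops_per_vertex / flops_per_sec


def estimate_communicated_cost(p: LayerProfile, net_bytes_per_sec: float) -> float:
    """Dependencies-communicated: pay network transfer for remote embeddings."""
    payload = p.boundary_vertices * p.avg_remote_degree * p.hidden_dim * p.bytes_per_float
    return payload / net_bytes_per_sec


def choose_strategy(p: LayerProfile, flops_per_sec: float, net_bytes_per_sec: float) -> Strategy:
    """Pick whichever strategy the (assumed) cost model estimates to be cheaper."""
    if estimate_cached_cost(p, flops_per_sec) <= estimate_communicated_cost(p, net_bytes_per_sec):
        return Strategy.CACHED
    return Strategy.COMMUNICATED


if __name__ == "__main__":
    # Illustrative numbers: compute-cheap layers on slow networks tend toward caching,
    # while wide embeddings on fast networks tend toward communicating dependencies.
    layer = LayerProfile(boundary_vertices=50_000, avg_remote_degree=8.0,
                         hidden_dim=256, flops_per_vertex=1e6)
    print(choose_strategy(layer, flops_per_sec=5e12, net_bytes_per_sec=1.25e9))
```

In a real system such a decision would also depend on model depth, partitioning, and GPU/CPU placement; the sketch only shows the shape of the adaptive choice, not the factors NeutronStar actually weighs.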

Citation (APA)

Wang, Q., Zhang, Y., Wang, H., Chen, C., Zhang, X., & Yu, G. (2022). NeutronStar: Distributed GNN Training with Hybrid Dependency Management. In Proceedings of the ACM SIGMOD International Conference on Management of Data (pp. 1301–1315). Association for Computing Machinery. https://doi.org/10.1145/3514221.3526134
