GTuner: Tuning DNN Computations on GPU via Graph Attention Network


Abstract

Compiling DNN models for GPUs and optimizing their performance remains an open problem. A novel framework, GTuner, is proposed to jointly learn from the structures of computational graphs and the statistical features of codes to find the optimal code implementations. A Graph ATtention network (GAT) is designed as the performance estimator in GTuner. In GAT, graph neural layers propagate information across the computational graph, and a multi-head self-attention module learns the complicated relationships between the features. Under the guidance of GAT, the GPU codes are generated through auto-tuning. Experimental results demonstrate that our method outperforms the prior art remarkably.
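The abstract itself contains no code, but the estimator it describes (graph message passing over the computational graph followed by multi-head self-attention over node features, pooled into a scalar performance prediction) can be sketched in plain NumPy. Everything below is a hypothetical illustration of that general architecture, not the paper's actual implementation; all function and variable names are invented, and a real estimator would be trained on measured kernel latencies.

```python
import numpy as np

def graph_layer(X, A, W):
    # Hypothetical message-passing layer: mean-aggregate neighbor
    # features over the adjacency matrix A, project with W, apply ReLU.
    deg = np.maximum(A.sum(axis=1, keepdims=True), 1.0)
    H = (A @ X) / deg          # aggregate neighbor features
    return np.maximum(H @ W, 0.0)

def multi_head_self_attention(X, num_heads, Wq, Wk, Wv):
    # Scaled dot-product self-attention over node features, one
    # projection triple (Wq[h], Wk[h], Wv[h]) per head.
    n, d = X.shape
    dh = d // num_heads
    out = np.zeros_like(X)
    for h in range(num_heads):
        q, k, v = X @ Wq[h], X @ Wk[h], X @ Wv[h]
        scores = (q @ k.T) / np.sqrt(dh)
        scores -= scores.max(axis=1, keepdims=True)  # numerical stability
        attn = np.exp(scores)
        attn /= attn.sum(axis=1, keepdims=True)      # rows sum to 1
        out[:, h * dh:(h + 1) * dh] = attn @ v
    return out

# Demo on a small random graph (5 nodes, 8-dim features, 2 heads).
rng = np.random.default_rng(0)
n, d, heads = 5, 8, 2
X = rng.standard_normal((n, d))
A = (rng.random((n, n)) < 0.5).astype(float)
np.fill_diagonal(A, 1.0)                             # add self-loops
W = rng.standard_normal((d, d))
dh = d // heads
Wq = [rng.standard_normal((d, dh)) for _ in range(heads)]
Wk = [rng.standard_normal((d, dh)) for _ in range(heads)]
Wv = [rng.standard_normal((d, dh)) for _ in range(heads)]

H = graph_layer(X, A, W)
out = multi_head_self_attention(H, heads, Wq, Wk, Wv)
# Pool node embeddings and map to a scalar performance estimate,
# which an auto-tuner could use to rank candidate code configurations.
w_out = rng.standard_normal(d)
pred = out.mean(axis=0) @ w_out
```

In an auto-tuning loop, such a predictor would score many candidate GPU code configurations cheaply, so that only the most promising ones are compiled and benchmarked on hardware.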

Citation (APA)

Sun, Q., Zhang, X., Geng, H., Zhao, Y., Bai, Y., Zheng, H., & Yu, B. (2022). GTuner: Tuning DNN Computations on GPU via Graph Attention Network. In Proceedings - Design Automation Conference (pp. 1045–1050). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1145/3489517.3530584
