GTuner: Tuning DNN Computations on GPU via Graph Attention Network


Abstract

Compiling DNN models for GPUs and optimizing their performance remains an open problem. A novel framework, GTuner, is proposed to jointly learn from the structures of computational graphs and the statistical features of codes to find the optimal code implementations. A Graph ATtention network (GAT) is designed as the performance estimator in GTuner. In GAT, graph neural layers propagate information across the computational graph, and a multi-head self-attention module learns the complicated relationships between the features. Under the guidance of GAT, the GPU codes are generated through auto-tuning. Experimental results demonstrate that our method outperforms the prior art remarkably.
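The abstract itself contains no code, but the estimator it describes (graph message passing over the computational graph followed by multi-head self-attention over node features, pooled into a scalar performance prediction) can be sketched in plain NumPy. Everything below is a hypothetical illustration of that general architecture, not the paper's actual implementation; all function and variable names are invented, and a real estimator would be trained on measured kernel latencies.

```python
import numpy as np

def graph_layer(X, A, W):
    # Hypothetical message-passing layer: mean-aggregate neighbor
    # features over the adjacency matrix A, project with W, apply ReLU.
    deg = np.maximum(A.sum(axis=1, keepdims=True), 1.0)
    H = (A @ X) / deg          # aggregate neighbor features
    return np.maximum(H @ W, 0.0)

def multi_head_self_attention(X, num_heads, Wq, Wk, Wv):
    # Scaled dot-product self-attention over node features, one
    # projection triple (Wq[h], Wk[h], Wv[h]) per head.
    n, d = X.shape
    dh = d // num_heads
    out = np.zeros_like(X)
    for h in range(num_heads):
        q, k, v = X @ Wq[h], X @ Wk[h], X @ Wv[h]
        scores = (q @ k.T) / np.sqrt(dh)
        scores -= scores.max(axis=1, keepdims=True)  # numerical stability
        attn = np.exp(scores)
        attn /= attn.sum(axis=1, keepdims=True)      # rows sum to 1
        out[:, h * dh:(h + 1) * dh] = attn @ v
    return out

# Demo on a small random graph (5 nodes, 8-dim features, 2 heads).
rng = np.random.default_rng(0)
n, d, heads = 5, 8, 2
X = rng.standard_normal((n, d))
A = (rng.random((n, n)) < 0.5).astype(float)
np.fill_diagonal(A, 1.0)                             # add self-loops
W = rng.standard_normal((d, d))
dh = d // heads
Wq = [rng.standard_normal((d, dh)) for _ in range(heads)]
Wk = [rng.standard_normal((d, dh)) for _ in range(heads)]
Wv = [rng.standard_normal((d, dh)) for _ in range(heads)]

H = graph_layer(X, A, W)
out = multi_head_self_attention(H, heads, Wq, Wk, Wv)
# Pool node embeddings and map to a scalar performance estimate,
# which an auto-tuner could use to rank candidate code configurations.
w_out = rng.standard_normal(d)
pred = out.mean(axis=0) @ w_out
```

In an auto-tuning loop, such a predictor would score many candidate GPU code configurations cheaply, so that only the most promising ones are compiled and benchmarked on hardware.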

Citation (APA)

Sun, Q., Zhang, X., Geng, H., Zhao, Y., Bai, Y., Zheng, H., & Yu, B. (2022). GTuner: Tuning DNN Computations on GPU via Graph Attention Network. In Proceedings - Design Automation Conference (pp. 1045–1050). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1145/3489517.3530584
