Parallelizing Neural Network Models Effectively on GPU by Implementing Reductions Atomically

Abstract

Lacking a good orchestration of loop transformations, existing optimizing compilers for deploying neural networks on GPU either parallelize reductions ineffectively or miss fusion opportunities with other operators. Neural network models thus exhibit sub-optimal performance on GPU. We present a practical approach called Panamera for the effective parallelization of reductions in neural networks on GPU. Panamera first leverages loop coalescing to flatten the loop dimensions of reductions, converting all reduction operators into canonical forms eligible for the polyhedral model. Next, Panamera uses polyhedral transformations to reduce the data movement caused by unfused reductions and to perform multi-block hardware binding, which many compilers do not consider. Finally, Panamera embeds a highly optimized routine implemented using GPU atomic instructions, further improving the performance of neural network models while guaranteeing the correctness of parallel reductions. The experimental results demonstrate the effectiveness of our approach: for single operators, our code obtains a mean speedup of 33.7×, 3.5×, 5.4× and 9.6× over cuDNN, CUB, TVM and Ansor; for sub-graphs, our approach outperforms cuDNN, TVM and Ansor by 9.5×, 2.6× and 2.7×; and for end-to-end workloads, a tensor compiler integrated with our approach outperforms them by 122.5%, 19.3% and 15.2%.
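The two ideas the abstract names, loop coalescing and combining partial results atomically, can be illustrated with a minimal CPU sketch in C++. This is not Panamera's implementation and involves no polyhedral machinery or GPU code: the function names `reduceNested` and `reduceCoalescedAtomic`, the row-major `[n][h][w]` layout, and the use of `std::thread` workers as a stand-in for GPU thread blocks are all assumptions made for illustration. Integer data is used so that the result does not depend on the order in which the partial sums arrive.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Reference: sum a row-major [n][h][w] tensor over (h, w) with a plain loop nest.
std::vector<long long> reduceNested(const std::vector<int>& in,
                                    std::size_t n, std::size_t h, std::size_t w) {
    std::vector<long long> out(n, 0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t y = 0; y < h; ++y)
            for (std::size_t x = 0; x < w; ++x)
                out[i] += in[(i * h + y) * w + x];
    return out;
}

// Hypothetical sketch: the (h, w) loops are coalesced into one loop of extent
// h * w, that flat range is split across `blocks` workers (standing in for GPU
// thread blocks), and each worker adds its partial sum to the output atomically.
std::vector<long long> reduceCoalescedAtomic(const std::vector<int>& in,
                                             std::size_t n, std::size_t h,
                                             std::size_t w, std::size_t blocks) {
    assert(blocks > 0 && in.size() == n * h * w);
    const std::size_t len = h * w;                          // flattened reduction extent
    const std::size_t chunk = (len + blocks - 1) / blocks;  // ceiling division

    std::vector<std::atomic<long long>> out(n);
    for (auto& o : out) o.store(0);

    std::vector<std::thread> workers;
    for (std::size_t b = 0; b < blocks; ++b) {
        workers.emplace_back([&, b] {
            const std::size_t begin = std::min(len, b * chunk);
            const std::size_t end = std::min(len, begin + chunk);
            for (std::size_t i = 0; i < n; ++i) {
                long long partial = 0;                      // worker-local accumulation
                for (std::size_t k = begin; k < end; ++k)
                    partial += in[i * len + k];
                // One atomic update per worker per output element.
                out[i].fetch_add(partial, std::memory_order_relaxed);
            }
        });
    }
    for (auto& t : workers) t.join();                       // join orders all updates before the reads

    std::vector<long long> result(n);
    for (std::size_t i = 0; i < n; ++i) result[i] = out[i].load();
    return result;
}
```

Calling `reduceCoalescedAtomic(in, n, h, w, blocks)` should return the same vector as `reduceNested(in, n, h, w)` for any positive `blocks`. With floating-point data the two could differ in rounding, because the order of the atomic additions is not fixed.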

Cite

Zhao, J., Bastoul, C., Yi, Y., Hu, J., Nie, W., Zhang, R., … Gan, Z. (2022). Parallelizing Neural Network Models Effectively on GPU by Implementing Reductions Atomically. In Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT (pp. 451–466). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1145/3559009.3569656
