Supporting data shuffle between threads in OpenMP


Abstract

Both NVIDIA and AMD GPUs provide shuffle or permutation instructions that move data directly between the private registers of different threads. Because a shuffle does not go through the device's shared or global memory, both of which are slower than direct register access, it offers an opportunity to optimize data copies and improve computing performance. However, shuffle is a low-level primitive for GPU programming (warp-level or lane-level on NVIDIA and AMD GPUs), and using it effectively requires advanced knowledge and skill. In this paper, we present two approaches to using shuffle in OpenMP: 1) a high-performance runtime implementation of the reduction clause using shuffle instructions; and 2) a proposed shuffle extension to OpenMP that lets users specify when and how data should be moved between threads. Using sum reduction and a 2D stencil as examples in our experiments, the shuffle implementation always delivers the best performance, with up to a 2.39x speedup over other high-performance implementations. Compared with standard OpenMP offloading code for the 2D stencil, our shuffle implementation is up to 25x faster. We also study a simulated shuffle that uses shared memory on NVIDIA GPUs, to demonstrate how this extension can be supported on hardware that has no native shuffle support.
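To make the idea of a shuffle-based sum reduction concrete, the following is a minimal sketch in plain C that emulates the pattern on the CPU. It is not the paper's runtime implementation or its proposed OpenMP extension. The names `shuffle_down` and `warp_reduce_sum`, the `WARP_SIZE` of 32, and the use of arrays to stand in for per-lane registers are all assumptions made for illustration. On an NVIDIA GPU the exchange step would typically be a hardware instruction such as CUDA's `__shfl_down_sync`.

```c
#include <assert.h>
#include <stddef.h>

#define WARP_SIZE 32  /* assumed lane count; 32 on NVIDIA GPUs */

/* Emulated "shuffle down": each lane reads the register of the lane
 * `delta` positions above it. A lane with no such source keeps its
 * own value. Arrays stand in for per-lane private registers. */
static void shuffle_down(const double *regs, double *out,
                         size_t width, size_t delta)
{
    for (size_t lane = 0; lane < width; ++lane) {
        size_t src = lane + delta;
        out[lane] = (src < width) ? regs[src] : regs[lane];
    }
}

/* Tree-style sum over `width` lanes (a power of two, at most WARP_SIZE).
 * After log2(width) exchange steps, lane 0 holds the total; the other
 * lanes hold partial values that are simply discarded. */
double warp_reduce_sum(const double *lane_values, size_t width)
{
    assert(width > 0 && width <= WARP_SIZE && (width & (width - 1)) == 0);

    double regs[WARP_SIZE];
    double recv[WARP_SIZE];
    for (size_t lane = 0; lane < width; ++lane)
        regs[lane] = lane_values[lane];

    for (size_t delta = width / 2; delta > 0; delta /= 2) {
        shuffle_down(regs, recv, width, delta);   /* exchange step */
        for (size_t lane = 0; lane < width; ++lane)
            regs[lane] += recv[lane];             /* local add */
    }
    return regs[0];
}
```

Calling `warp_reduce_sum` on 32 lanes holding the values 1 through 32 returns 528 after five exchange steps. On real hardware each step is a register-to-register move with no intermediate store, which is the property the abstract credits for the performance gain.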

Cite

Wang, A., Yi, X., & Yan, Y. (2020). Supporting data shuffle between threads in OpenMP. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12295 LNCS, pp. 98–112). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-58144-2_7
