Warps and atomics: Beyond barrier synchronization in the verification of GPU kernels

Ethel Bardsley; Alastair F. Donaldson

Conference Proceedings

Warps and atomics: Beyond barrier synchronization in the verification of GPU kernels

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2014) 8430 LNCS 230-245

DOI: 10.1007/978-3-319-06200-6_18

26Citations

16Readers

Get full text

Abstract

We describe the design and implementation of methods to support reasoning about data races in GPU kernels where constructs other than the standard barrier primitive are used for synchronization. At one extreme we consider kernels that exploit implicit, coarse-grained synchronization between threads in the same warp, a feature provided by many architectures. At the other extreme we consider kernels that reduce or avoid barrier synchronization through the use of atomic operations. We discuss design decisions associated with providing support for warps and atomics in GPUVerify, a formal verification tool for OpenCL and CUDA kernels. We evaluate the practical impact of these design decisions using a large set of benchmarks, showing that warps can be supported in a scalable manner, that a coarse abstraction suffices for efficient reasoning about most practical uses of atomic operations, and that a novel, refined abstraction captures an important design pattern where atomic operations are used to compute unique array indices. Our evaluation revealed two previously unknown bugs in publicly available benchmark suites. © 2014 Springer International Publishing.

Cite

CITATION STYLE

APA

Bardsley, E., & Donaldson, A. F. (2014). Warps and atomics: Beyond barrier synchronization in the verification of GPU kernels. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8430 LNCS, pp. 230–245). Springer Verlag. https://doi.org/10.1007/978-3-319-06200-6_18

Warps and atomics: Beyond barrier synchronization in the verification of GPU kernels

Abstract

Cite

Register to see more suggestions