Block-Size Independence for GPU Programs

Abstract

Optimizing GPU programs by tuning execution parameters is essential to realizing the full performance potential of GPU hardware. However, many of these optimizations do not guarantee correctness, and subtle errors can creep in while optimizing a GPU program. Further, the lack of formal models and the presence of non-trivial transformations prevent verification of these optimizations. In this work, we verify the transformations involved in tuning one execution parameter, block size. First, we present a formal programming and execution model for GPUs, and then formalize block-size independence of GPU programs, which ensures that tuning block size preserves program semantics. Next, we present an inter-procedural analysis to verify block-size independence for synchronization-free GPU programs. Finally, we evaluate the analysis on the Nvidia CUDA SDK samples, where 35 global kernels are verified to be block-size independent.
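
To make the notion of block-size independence concrete, the following is a minimal Python sketch that emulates a one-dimensional, synchronization-free kernel launch and compares the output across several block sizes. It is an illustration only and not the paper's method: the paper verifies the property by static analysis, whereas this sketch merely tests a handful of block sizes. It assumes the total number of threads is held fixed while the block size varies, and that threads write to distinct locations so sequential emulation is faithful. All names (`launch`, `scale_independent`, `scale_dependent`, `is_block_size_independent`) are hypothetical.

```python
def launch(kernel, total_threads, block_dim, n):
    """Sequentially emulate a 1-D, synchronization-free kernel launch.

    Assumption: total_threads is fixed and only block_dim varies, so the
    grid size is derived as total_threads // block_dim.
    """
    assert total_threads % block_dim == 0, "block_dim must divide total_threads"
    grid_dim = total_threads // block_dim
    out = [0] * n
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(out, block_idx, thread_idx, block_dim)
    return out


def scale_independent(out, block_idx, thread_idx, block_dim):
    # Uses only the global thread id, so the result does not depend on
    # how the threads are grouped into blocks.
    gid = block_idx * block_dim + thread_idx
    if gid < len(out):
        out[gid] = 2 * gid


def scale_dependent(out, block_idx, thread_idx, block_dim):
    # Stores the thread's position within its block, which changes
    # whenever the block size changes.
    gid = block_idx * block_dim + thread_idx
    if gid < len(out):
        out[gid] = thread_idx


def is_block_size_independent(kernel, total_threads, n, block_dims):
    """Empirical check over the given block sizes (a test, not a proof)."""
    results = [launch(kernel, total_threads, b, n) for b in block_dims]
    return all(r == results[0] for r in results)


if __name__ == "__main__":
    sizes = [1, 2, 4, 8, 16]
    print(is_block_size_independent(scale_independent, 16, 16, sizes))
    print(is_block_size_independent(scale_dependent, 16, 16, sizes))
```

In this sketch, the first kernel produces the same array for every tested block size, while the second does not, because its stored value is the within-block index rather than a function of the global thread id.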

Cite

Alur, R., Devietti, J., & Singhania, N. (2018). Block-Size Independence for GPU Programs. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11002 LNCS, pp. 107–126). Springer Verlag. https://doi.org/10.1007/978-3-319-99725-4_9
