CUDA 2D stencil computations for the Jacobi method

José María Cecilia; José Manuel García; Manuel Ujaldón

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2012) 7133 LNCS(PART 1) 173-183

DOI: 10.1007/978-3-642-28151-8_17

8 Citations

47 Readers

Abstract

We are witnessing the consolidation of the GPU streaming paradigm in parallel computing. This paper explores stencil operations in CUDA to optimize the Jacobi method for solving Laplace's differential equation on GPUs. The code keeps the access pattern constant through a large number of loop iterations, which makes it representative of a wide set of iterative linear algebra algorithms. Optimizations focus on data parallelism, thread deployment and the GPU memory hierarchy, which the CUDA programmer manages explicitly. Experimental results are shown on Nvidia Tesla C870 and C1060 GPUs and compared to a counterpart version optimized for a quad-core Intel CPU. Our set of GPU optimizations yields a speed-up factor of 3-4x, and execution times beat those of the CPU by a wide margin, with great scalability when moving to a more sophisticated GPU architecture and/or more demanding problem sizes. © 2012 Springer-Verlag.
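For readers unfamiliar with the computation the abstract refers to, the following is a minimal sequential sketch of a Jacobi sweep for Laplace's equation on a 2D grid, assuming the standard 5-point stencil in which each interior point is replaced by the average of its four neighbours. It is not the authors' CUDA code: the function names (`jacobi_step`, `solve_laplace`, `make_grid`) and the boundary setup are hypothetical and chosen only for illustration. It does show the fixed access pattern repeated across iterations; in a GPU version, each interior point would typically be assigned to its own thread.

```python
def jacobi_step(u):
    """Return a new grid after one Jacobi sweep of the 5-point stencil.

    Boundary rows and columns are treated as fixed (Dirichlet) values.
    """
    rows, cols = len(u), len(u[0])
    new = [row[:] for row in u]  # copy, so boundary values carry over unchanged
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            # Every update reads only the OLD grid, so all interior points
            # are independent within a sweep (this is what exposes the
            # data parallelism mentioned in the abstract).
            new[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1])
    return new


def solve_laplace(u, iterations):
    """Apply a fixed number of Jacobi sweeps (no convergence test, for brevity)."""
    for _ in range(iterations):
        u = jacobi_step(u)
    return u


def make_grid(n, top=1.0):
    """Hypothetical test problem: n x n grid, top edge held at `top`, rest 0."""
    grid = [[0.0] * n for _ in range(n)]
    grid[0] = [top] * n
    return grid


if __name__ == "__main__":
    result = solve_laplace(make_grid(8), 100)
    for row in result:
        print(" ".join(f"{value:.3f}" for value in row))
```

Writing into a separate output grid, rather than updating in place, is what distinguishes Jacobi from Gauss-Seidel and is the reason the sweep can be parallelized without ordering constraints.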

Citation (APA)

Cecilia, J. M., García, J. M., & Ujaldón, M. (2012). CUDA 2D stencil computations for the Jacobi method. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7133 LNCS, pp. 173–183). https://doi.org/10.1007/978-3-642-28151-8_17
