A GPU implementation for solving the convection diffusion equation using the local modified SOR method

Abstract

In this chapter we describe a parallel CUDA implementation of the successive over-relaxation (SOR) method for the numerical solution of the convection-diffusion equation on GPUs. We demonstrate two generally applicable programming techniques: memory reordering as a means of achieving coalesced memory access, and recomputation of stored data as a means of alleviating the memory bandwidth bottleneck and increasing the feasible problem size. We focus on the local relaxation version of SOR. In particular, we apply the local Modified SOR method (LMSOR), which has a better rate of convergence than SOR. We present our CUDA implementations of the LMSOR method, with optimizations aimed at exploiting the computational capabilities of modern GPUs. In addition, we report performance results for GPUs based on the Fermi and Kepler architectures and for a contemporary quad-core CPU. The CPU implementation is parallelized with OpenMP and uses manual AVX (Advanced Vector Extensions) vectorization. The results for recomputation look promising, and we expect the technique to become more significant in the near future.
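
To make the ideas of local relaxation and recomputation concrete, here is a minimal CPU-side sketch in Python. It is not the chapter's CUDA code. It assumes the equation has the form u_xx + u_yy - f(x, y) u_x - g(x, y) u_y = 0 on the unit square with Dirichlet boundary values, discretized by central differences on a uniform grid. The names `solve_local_sor`, `boundary`, and `omega` are hypothetical. The abstract does not give the LMSOR parameter formulas, so `omega` is a caller-supplied placeholder for the per-point relaxation factor. The stencil coefficients are recomputed at every update instead of being read from stored arrays, which is the recomputation idea in its simplest form. The memory reordering used for coalescing on the GPU is not shown.

```python
def solve_local_sor(n, f, g, boundary, omega, tol=1e-12, max_sweeps=10_000):
    """Red-black SOR with a per-point relaxation factor on an (n+1) x (n+1) grid."""
    h = 1.0 / n
    # Fill the whole grid from the boundary function, then zero the interior.
    u = [[boundary(i * h, j * h) for j in range(n + 1)] for i in range(n + 1)]
    for i in range(1, n):
        for j in range(1, n):
            u[i][j] = 0.0

    for sweep in range(1, max_sweeps + 1):
        max_change = 0.0
        for colour in (0, 1):  # 0: red points (i + j even), 1: black points
            for i in range(1, n):
                for j in range(1, n):
                    if (i + j) % 2 != colour:
                        continue
                    x, y = i * h, j * h
                    # Recompute the stencil coefficients on the fly
                    # (assumed central-difference discretization).
                    fx = 0.5 * h * f(x, y)
                    gy = 0.5 * h * g(x, y)
                    left, right = 0.25 * (1 + fx), 0.25 * (1 - fx)
                    down, up = 0.25 * (1 + gy), 0.25 * (1 - gy)
                    weighted = (left * u[i - 1][j] + right * u[i + 1][j]
                                + down * u[i][j - 1] + up * u[i][j + 1])
                    w = omega(i, j)  # local relaxation factor (placeholder)
                    new = (1 - w) * u[i][j] + w * weighted
                    max_change = max(max_change, abs(new - u[i][j]))
                    u[i][j] = new
        if max_change < tol:
            return u, sweep
    return u, max_sweeps


# Example: with f = 1 and g = -1, u(x, y) = x + y satisfies the assumed equation.
u, sweeps = solve_local_sor(
    n=16,
    f=lambda x, y: 1.0,
    g=lambda x, y: -1.0,
    boundary=lambda x, y: x + y,
    omega=lambda i, j: 1.5,  # constant here; a local scheme would vary it per point
)
```

Points of one colour depend only on points of the other colour, so each half-sweep can be updated in parallel. That independence is what makes red-black ordering a natural fit for a GPU kernel.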

Cite

Cotronis, Y., Konstantinidis, E., & Missirlis, N. M. (2014). A GPU implementation for solving the convection diffusion equation using the local modified SOR method. In Numerical Computations with GPUs (pp. 207–221). Springer International Publishing. https://doi.org/10.1007/978-3-319-06548-9_10
