CUDA memory optimizations for large data-structures in the gravit simulator

Jacob Siegel; Juergen Ributzka; Xiaoming Li

Conference Proceedings, Open Access

Journal of Algorithms and Computational Technology (2011) 5(2) 341-362

DOI: 10.1260/1748-3018.5.2.341

Abstract

Modern GPUs open a completely new field for optimizing embarrassingly parallel algorithms. Implementing an algorithm on a GPU confronts the programmer with a new set of optimization challenges, in particular tuning the program for the GPU memory hierarchy, whose organization and performance implications are radically different from those of general-purpose CPUs, and optimizing programs at the instruction level for the GPU. In this paper we analyze different approaches for optimizing memory usage and access patterns on GPUs and propose a class of memory layout optimizations that can take full advantage of the unique memory hierarchy of NVIDIA CUDA. Furthermore, we analyze some classical optimization techniques and how they affect performance on a GPU. We used the Gravit gravity simulator to demonstrate these optimizations. The final optimized GPU version achieves an 87× speedup compared to the original CPU version. Almost 30% of this speedup is a direct result of the optimizations discussed in this paper.
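
To illustrate why memory layout matters for GPU access patterns, the following is a minimal sketch of a simplified coalescing model, not the authors' code or their exact layouts. It counts how many aligned memory segments one warp touches when each thread reads a single field of its own particle, once with an array-of-structures layout and once with a structure-of-arrays layout. The 7-float particle (position, velocity, mass), the 128-byte segment size, and all function names are assumptions chosen for illustration; real coalescing rules depend on the GPU generation.

```python
# Simplified model (an assumption, not the paper's implementation):
# count the aligned memory segments touched by one warp when every
# thread reads one float field of its own particle.

WARP_SIZE = 32        # threads per warp on NVIDIA GPUs
SEGMENT_BYTES = 128   # assumed memory transaction granularity
FLOAT_BYTES = 4       # size of a single-precision float


def segments_touched(addresses, segment_bytes=SEGMENT_BYTES):
    """Return the number of distinct aligned segments covering the addresses."""
    return len({addr // segment_bytes for addr in addresses})


def aos_addresses(field_index, fields_per_particle, warp_size=WARP_SIZE):
    """Byte addresses for an array of structures: particle i's fields are adjacent."""
    stride = fields_per_particle * FLOAT_BYTES
    return [tid * stride + field_index * FLOAT_BYTES for tid in range(warp_size)]


def soa_addresses(field_index, num_particles, warp_size=WARP_SIZE):
    """Byte addresses for a structure of arrays: each field has its own array."""
    base = field_index * num_particles * FLOAT_BYTES
    return [base + tid * FLOAT_BYTES for tid in range(warp_size)]


if __name__ == "__main__":
    # Hypothetical particle: x, y, z, vx, vy, vz, mass (7 floats, 28 bytes).
    print(segments_touched(aos_addresses(0, 7)))      # prints 7
    print(segments_touched(soa_addresses(0, 1024)))   # prints 1
```

Under these assumptions, the array-of-structures read spreads one warp's 32 loads over 7 segments, while the structure-of-arrays read fits in a single segment, which is the kind of difference a layout optimization aims to exploit.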

Cite

Siegel, J., Ributzka, J., & Li, X. (2011). CUDA memory optimizations for large data-structures in the gravit simulator. In Journal of Algorithms and Computational Technology (Vol. 5, pp. 341–362). https://doi.org/10.1260/1748-3018.5.2.341
