Accelerator-based heterogeneous computing is gaining momentum in High Performance Computing arena. However, the increased complexity of the accelerator architectures demands more generic, highlevel programming models. OpenACC is one such attempt to tackle the problem. While the abstraction endowed by OpenACC offers productivity, it raises questions on its portability. This paper evaluates the performance portability obtained by OpenACC on twelve OpenACC programs on NVIDIA CUDA, AMD GCN, and Intel MIC architectures. We study the effects of various compiler optimizations and OpenACC program settings on these architectures to provide insights into the achieved performance portability.
CITATION STYLE
Sabne, A., Sakdhnagool, P., Lee, S., & Vetter, J. S. (2015). Evaluating performance portability of OpenACC. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8967, pp. 51–66). Springer Verlag. https://doi.org/10.1007/978-3-319-17473-0_4
Mendeley helps you to discover research relevant for your work.