An Early Performance Comparison of CUDA and OpenACC

Abstract

This paper presents a performance comparison between CUDA and OpenACC. The analysis focuses on the programming models and their underlying compilers. In addition, we propose a Performance Ratio of Data Sensitivity (PRoDS) metric to compare objectively what has traditionally been judged subjectively: how sensitive OpenACC and CUDA implementations are to changes in data size. The results show that, in terms of kernel running time, OpenACC performs worse than CUDA because the PGI compiler must translate OpenACC kernels into object code, whereas CUDA code can be run directly. When optimizations are applied, OpenACC programs are more sensitive to changes in data size than the equivalent CUDA programs; without optimizations, CUDA is more sensitive than OpenACC. Overall, we found that OpenACC is a reliable programming model and a good alternative to CUDA for accelerator devices.

Li, X., & Shih, P. C. (2018). An Early Performance Comparison of CUDA and OpenACC. In MATEC Web of Conferences (Vol. 208). EDP Sciences. https://doi.org/10.1051/matecconf/201820805002
