In this paper we describe the process of porting the Nek- Bone mini-application to run on a Cray XC30 hybrid supercomputer using OpenMP device constructs, as introduced in version 4.0 of the OpenMP standard and implemented in a pre-release version of the Cray Compilation Environment (CCE) compiler. We document the process of porting and show how the performance evolves during the addition on the 66 constructs needed to accelerate the application. In doing so, we provide a user-centric introduction to the device constructs and an overview of the approach needed to port a parallel application using these. Some contrasts with OpenACC are also drawn to aid those wishing to either implement both programming models or to migrate from one to the other.
Hart, A. (2015). First experiences porting a parallel application to a hybrid supercomputer with OpenMP4.0 device constructs. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9342, pp. 73–85). Springer Verlag. https://doi.org/10.1007/978-3-319-24595-9_6