Scaling OpenMP for Exascale Performance and Portability

Bronis R. de Supinski; Stephen L. Olivier; Christian Terboven; Barbara M. Chapman; Matthias S. Müller

Journal Article

13th International Workshop on OpenMP, IWOMP 2017 Stony Brook, NY, USA, September 20–22, 2017 Proceedings (2017) 1(Part II) 17-29

Abstract

Given their massively parallel computing capabilities, heterogeneous architectures comprising CPUs and accelerators are increasingly used to speed up scientific and engineering applications. Nevertheless, programming such architectures is challenging for most non-expert programmers, as typical accelerator programming languages (e.g., CUDA and OpenCL) demand a thorough understanding of the underlying hardware to achieve an effective application speed-up. To reach that speed-up, programmers are usually required to significantly change and adapt program structures and algorithms, which impacts both performance and productivity. A simpler alternative is to use high-level directive-based programming models such as OpenACC and OpenMP. These models allow programmers to insert directives and runtime calls into existing source code, providing hints that let the compiler and runtime perform certain transformations and optimizations on the annotated code regions. In this paper, we present ACLang, an open-source LLVM/Clang compiler framework (http://www.aclang.org) that implements the recently released OpenMP 4.X Accelerator Programming Model. ACLang automatically converts OpenMP 4.X annotated program regions into OpenCL/SPIR kernels, while providing a set of polyhedral-based optimizations such as tiling and vectorization. OpenCL kernels produced by ACLang can be executed on any OpenCL/SPIR-compatible acceleration device, including not only GPUs but also FPGA accelerators such as those found in the Intel HARP architecture. To the best of our knowledge, and at the time this paper was written, this is the first LLVM/Clang implementation of the OpenMP 4.X Accelerator Model that provides a source-to-target OpenCL conversion.
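
To make the directive-based approach concrete, the following is a minimal sketch of the kind of annotated region the abstract refers to: an ordinary C loop marked for offloading with OpenMP 4.x target directives. The function name vec_add and its parameters are hypothetical and do not come from the paper, and the sketch does not show what ACLang itself generates. When compiled without OpenMP offloading support, the pragma is ignored and the loop simply runs on the host.

```c
#include <stddef.h>

/* Hypothetical example (not from the paper): element-wise vector
 * addition annotated for accelerator offloading. */
void vec_add(const float *a, const float *b, float *c, size_t n)
{
    /* map(to: ...) copies the inputs to the device before the region;
     * map(from: ...) copies the result back to the host afterwards.
     * The loop body itself is unchanged sequential C. */
    #pragma omp target teams distribute parallel for \
        map(to: a[0:n], b[0:n]) map(from: c[0:n])
    for (size_t i = 0; i < n; ++i) {
        c[i] = a[i] + b[i];
    }
}
```

The point of the model is visible in how little changes: the loop is the same one a sequential program would contain, and the directive alone describes the data movement and the parallelism that a compiler may turn into an accelerator kernel.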

Citation (APA)

de Supinski, B. R., Olivier, S. L., Terboven, C., Chapman, B. M., & Müller, M. S. (2017). Scaling OpenMP for Exascale Performance and Portability. 13th International Workshop on OpenMP, IWOMP 2017 Stony Brook, NY, USA, September 20–22, 2017 Proceedings, 1(Part II), 17–29. Retrieved from http://link.springer.com/10.1007/978-3-319-65578-9
