A Case Study of Porting HPGMG from CUDA to OpenMP Target Offload

Christopher Daley; Hadia Ahmed; Samuel Williams; Nicholas Wright

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2020) 12295 LNCS 37-51

DOI: 10.1007/978-3-030-58144-2_3

Abstract

The HPGMG benchmark is a non-trivial Multigrid benchmark used to evaluate system performance. We ported this benchmark from CUDA to OpenMP target offload and added the capability to use explicit data management rather than managed memory. Our optimized OpenMP target offload implementation obtains a performance of 0.73x and 2.04x versus the baseline CUDA version on two different node architectures with NVIDIA Volta GPUs. We explain how we successfully used OpenMP target offload, including the code refactoring required, and how we improved upon our initial performance with LLVM/Clang by 97x.
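
To illustrate what "explicit data management rather than managed memory" can look like in OpenMP target offload, here is a minimal sketch in C. It is not code from HPGMG or from the paper: the function `smooth_1d`, its arguments, and the 1D Jacobi-style update are hypothetical stand-ins for a multigrid smoother. The point is only the pattern, in which a `target data` region maps the arrays to the device once and the offloaded loops inside it reuse the resident copies.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 1D Jacobi-style smoother (an illustration, not HPGMG's kernel).
 * If the compiler has no OpenMP offload support, the pragmas are ignored
 * and the same loops simply run on the host. */
void smooth_1d(double *x, double *tmp, const double *rhs, size_t n, int sweeps)
{
    assert(n >= 3 && sweeps >= 0);

    /* Explicit data management: x is copied to the device on entry and back
     * on exit, rhs is copied in only, and tmp is allocated on the device
     * without any host transfer. The arrays stay resident for all sweeps. */
    #pragma omp target data map(tofrom: x[0:n]) map(to: rhs[0:n]) map(alloc: tmp[0:n])
    {
        for (int s = 0; s < sweeps; ++s) {
            /* Offloaded update of the interior points into tmp. */
            #pragma omp target teams distribute parallel for
            for (size_t i = 1; i < n - 1; ++i)
                tmp[i] = 0.5 * (x[i - 1] + x[i + 1] + rhs[i]);

            /* Offloaded copy back into x; boundary values are left untouched. */
            #pragma omp target teams distribute parallel for
            for (size_t i = 1; i < n - 1; ++i)
                x[i] = tmp[i];
        }
    }
}
```

Because the mapping is hoisted out of the sweep loop, host-device transfers happen once per call rather than once per kernel, which is the kind of control that managed memory leaves to the runtime. The compiler flags needed to enable offload depend on the toolchain.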

Cite

Daley, C., Ahmed, H., Williams, S., & Wright, N. (2020). A case study of porting HPGMG from CUDA to OpenMP target offload. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12295 LNCS, pp. 37–51). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-58144-2_3
