Asynchronous AMR on Multi-GPUs

Muhammad Nufail Farooqi; Tan Nguyen; Weiqun Zhang; Ann S. Almgren; John Shalf; Didem Unat

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11887 LNCS, pp. 113-123 (2019)

DOI: 10.1007/978-3-030-34356-9_11

Abstract

Adaptive Mesh Refinement (AMR) is a computationally and memory-efficient technique for solving partial differential equations. Because many supercomputers now include GPUs, AMR frameworks must evolve to run on large-scale heterogeneous systems. However, employing multiple GPUs and achieving good scalability is challenging in AMR because of its complex communication pattern. In this paper, we present our asynchronous AMR runtime system, which simultaneously schedules tasks on both CPUs and GPUs and coordinates data movement between the different processing units. Our runtime adapts to various machine configurations and uses a host-resident data model. It facilitates the use of streams to overlap CPU-GPU data transfers with computation and to increase device occupancy. We perform strong and weak scaling studies using an advection solver on the Piz Daint supercomputer and achieve high performance.
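
The host-resident, per-tile pattern described in the abstract can be illustrated with a small CPU-only analogy. This is a sketch under stated assumptions, not the authors' runtime: worker threads stand in for GPU streams, list copies stand in for host-to-device and device-to-host transfers, and a scaling loop stands in for a solver kernel. The names `process_tile` and `run_tiles` and all parameters are hypothetical. In CPython the threads interleave rather than run truly in parallel, so the sketch shows only the structure of the pipeline, not its performance.

```python
from concurrent.futures import ThreadPoolExecutor


def process_tile(host_data, tile_id, start, stop, factor):
    """One 'stream' of work for a tile: copy in, compute, copy back."""
    device_buf = list(host_data[start:stop])       # stands in for a host-to-device copy
    device_buf = [x * factor for x in device_buf]  # stands in for a GPU kernel
    host_data[start:stop] = device_buf             # stands in for a device-to-host copy
    return tile_id


def run_tiles(host_data, num_tiles, factor, num_streams=2):
    """Submit each tile as an independent task so several tiles can be in flight."""
    n = len(host_data)
    tile_size = (n + num_tiles - 1) // num_tiles   # ceiling division
    with ThreadPoolExecutor(max_workers=num_streams) as pool:
        futures = [
            pool.submit(process_tile, host_data, t,
                        t * tile_size, min((t + 1) * tile_size, n), factor)
            for t in range(num_tiles)
        ]
        # Waiting on every future plays the role of a device-wide synchronize.
        return sorted(f.result() for f in futures)


host_data = [1.0] * 10  # the authoritative, host-resident copy of the mesh data
finished = run_tiles(host_data, num_tiles=4, factor=2.0)
```

Because the authoritative copy of the data stays on the host and each tile touches a disjoint slice, tiles can be submitted independently; with real GPU streams, one tile's transfer could then overlap with another tile's kernel.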

Citation (APA)

Farooqi, M. N., Nguyen, T., Zhang, W., Almgren, A. S., Shalf, J., & Unat, D. (2019). Asynchronous AMR on Multi-GPUs. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11887 LNCS, pp. 113–123). Springer. https://doi.org/10.1007/978-3-030-34356-9_11
