Abstract
Recent studies have shown promising performance benefits when multiple stages of a pipelined stencil application are mapped to different parts of a GPU to run concurrently. An important factor for the computing efficiency of such pipelines is the granularity of a task. In previous programming frameworks that support true pipelined computations on GPU, the choice has to be made by the programmers during the application development time. Due to many difficulties, programmers' decisions are often far from optimal, causing inferior performance and performance portability. This paper presents GOPipe, a granularity-oblivious programming framework for efficient pipelined stencil executions on GPU. With GOPipe, programmers no longer need to specify the appropriate task granularity. GOPipe automatically finds it, and dynamically schedules tasks of that granularity for efficiency while observing all inter-task and inter-stage data dependencies. In our experiments on six real-life applications and various scenarios, GOPipe outperforms the state-of-the-art system by 1.39× on average with a much better programming productivity.
Cite
Oh, C., Zheng, Z., Shen, X., Zhai, J., & Yi, Y. (2020). GOPipe: A granularity-oblivious programming framework for pipelined stencil executions on GPU. In Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT (pp. 43–54). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1145/3410463.3414656