Quantifying Data Locality in Dynamic Parallelism in GPUs

Xulong Tang; Ashutosh Pattnaik; Onur Kayiran; Adwait Jog; Mahmut Taylan Kandemir; Chita Das

Journal Article, Open Access

Performance Evaluation Review (2019) 47(1) 25-26

DOI: 10.1145/3309697.3331473

Abstract

Dynamic parallelism (DP) is a new feature of emerging GPUs that allows new kernels to be generated and scheduled from the device side (GPU) without host-side (CPU) intervention. To efficiently support DP, one of the major challenges is to saturate the GPU processing elements and provide them with the required data in a timely fashion. In this paper, we first conduct a limit study on the performance improvements that can be achieved by hardware schedulers that are provided with accurate data reuse information. We next propose LASER, a Locality-Aware SchedulER, in which the hardware schedulers employ data reuse monitors to help make scheduling decisions that improve data locality at runtime. Experimental results on 16 benchmarks show that LASER improves performance by 11.3% on average.
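
To illustrate why scheduling order can matter for data reuse, the following is a minimal toy model in Python, not the paper's LASER mechanism. It assumes a tiny LRU cache and hypothetical parent and child kernels whose memory footprints overlap. The kernel names, addresses, and cache size are all invented for illustration. It compares a schedule that interleaves unrelated kernels with one that runs each child right after its parent.

```python
from collections import OrderedDict


def count_hits(schedule, footprints, cache_lines):
    """Replay kernels in `schedule` order through a small LRU cache.

    Returns the number of accesses that hit in the cache.
    """
    cache = OrderedDict()  # keys are addresses, ordered from least to most recent
    hits = 0
    for kernel in schedule:
        for addr in footprints[kernel]:
            if addr in cache:
                hits += 1
                cache.move_to_end(addr)  # mark as most recently used
            else:
                cache[addr] = True
                if len(cache) > cache_lines:
                    cache.popitem(last=False)  # evict least recently used
    return hits


# Hypothetical footprints: each child kernel re-reads its parent's data.
footprints = {
    "parent_A": [0, 1, 2, 3],
    "child_A": [0, 1, 2, 3],
    "parent_B": [10, 11, 12, 13],
    "child_B": [10, 11, 12, 13],
}

# Locality-oblivious order: both parents run before either child.
oblivious_order = ["parent_A", "parent_B", "child_A", "child_B"]
# Locality-aware order: each child runs immediately after its parent.
aware_order = ["parent_A", "child_A", "parent_B", "child_B"]

oblivious_hits = count_hits(oblivious_order, footprints, cache_lines=4)
aware_hits = count_hits(aware_order, footprints, cache_lines=4)
print(oblivious_hits, aware_hits)  # prints "0 8"
```

In this toy setup the locality-oblivious order evicts each parent's data before its child runs, while the locality-aware order lets every child access hit in the cache. Real GPU schedulers and cache hierarchies are far more complex, so the numbers here only show the direction of the effect.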

Citation (APA)

Tang, X., Pattnaik, A., Kayiran, O., Jog, A., Kandemir, M. T., & Das, C. (2019). Quantifying Data Locality in Dynamic Parallelism in GPUs. Performance Evaluation Review, 47(1), 25–26. https://doi.org/10.1145/3309697.3331473
