A Quantitative Study of Locality in GPU Caches

Sohan Lal; Ben Juurlink

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2020), Vol. 12471 LNCS, pp. 228–242

DOI: 10.1007/978-3-030-60939-9_16

Abstract

Traditionally, GPUs had only programmer-managed caches. The advent of hardware-managed caches accelerated the use of GPUs for general-purpose computing. However, because GPU caches are shared by thousands of threads, they are often subject to contention and can suffer from thrashing and high miss rates, particularly for memory-divergent workloads. As data locality is crucial for performance, several efforts have focused on exploiting data locality in GPUs. However, a quantitative analysis of data locality and data reuse in GPUs has been lacking. In this paper, we quantitatively study data locality and its limits in GPUs. We observe that data locality is much higher than what current GPUs exploit. We show that, on the one hand, the low spatial utilization of cache lines justifies the use of demand-fetched caches. On the other hand, the much higher actual spatial utilization of cache lines shows the lost spatial locality and presents opportunities for further optimizing the cache design.
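
To make the term "spatial utilization of cache lines" concrete, the following is a minimal sketch that computes, for a list of byte addresses, the average fraction of each referenced cache line that is actually touched. It is an illustration only and not the authors' methodology: the function name `spatial_utilization`, the 128-byte line size, the 4-byte access size, and the two synthetic access patterns are all assumptions, and the sketch ignores cache capacity, evictions, and timing.

```python
from collections import defaultdict

LINE_SIZE = 128  # bytes per cache line (assumed for illustration)
WORD_SIZE = 4    # bytes read per access (assumed for illustration)


def spatial_utilization(addresses, line_size=LINE_SIZE, word_size=WORD_SIZE):
    """Average fraction of bytes touched per referenced cache line.

    Hypothetical helper: works on a plain address list and does not
    model a real cache (no capacity limit, no evictions).
    """
    touched = defaultdict(set)  # line index -> byte offsets touched in that line
    for addr in addresses:
        for byte in range(addr, addr + word_size):
            touched[byte // line_size].add(byte % line_size)
    if not touched:
        return 0.0
    used_bytes = sum(len(offsets) for offsets in touched.values())
    return used_bytes / (len(touched) * line_size)


# 32 threads reading consecutive 4-byte words: a single, fully used line.
coalesced = [4 * t for t in range(32)]

# 32 threads reading one 4-byte word each at a 128-byte stride:
# 32 lines are referenced, but only 4 bytes of each are used.
divergent = [128 * t for t in range(32)]

print(spatial_utilization(coalesced))
print(spatial_utilization(divergent))
```

Under these assumptions the coalesced pattern uses every byte of the single line it references, while the strided pattern uses 4 of 128 bytes in each of 32 lines, which is the kind of gap in per-line use that the abstract refers to for memory-divergent workloads.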

Cite (APA)

Lal, S., & Juurlink, B. (2020). A Quantitative Study of Locality in GPU Caches. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12471 LNCS, pp. 228–242). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-60939-9_16
