In recent years, ever-growing application complexity and input dataset sizes have driven the popularity of multi-GPU systems as a desirable computing platform for many application domains. While employing multiple GPUs intuitively exposes substantial parallelism for application acceleration, the delivered performance rarely scales with the number of GPUs. One of the major challenges behind this gap is address translation efficiency. Many prior works focus on CPU or single-GPU execution scenarios, while address translation in multi-GPU systems has received little attention. In this paper, we conduct a comprehensive investigation of address translation efficiency in both "single-application-multi-GPU" and "multi-application-multi-GPU" execution paradigms. Based on our observations, we propose a new TLB hierarchy design, called least-TLB, that is tailored for multi-GPU systems and effectively improves TLB performance with minimal hardware overhead. Experimental results on 9 single-application workloads and 10 multi-application workloads indicate that the proposed least-TLB improves performance, on average, by 23.5% and 16.3%, respectively.
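The abstract does not describe the mechanism itself, so as a rough illustration of what a sharing- and spilling-aware TLB lookup across GPUs might look like, the sketch below models two GPU-side TLBs that can probe each other for shared translations and spill evicted entries to a peer with spare capacity. All names here (SpillAwareTLB, lookup, fill, page_walk) are hypothetical and assumed for illustration; this is not the paper's actual least-TLB design.

```python
# Conceptual sketch of a sharing- and spilling-aware TLB lookup across GPUs.
# Structure and names are illustrative assumptions, not the paper's design.

class SpillAwareTLB:
    def __init__(self, gpu_id, capacity, peers=None):
        self.gpu_id = gpu_id
        self.capacity = capacity
        self.entries = {}          # virtual page -> physical page
        self.peers = peers or []   # peer GPUs' TLBs that may hold shared/spilled entries

    def lookup(self, vpage):
        # 1. Local hit: translation cached in this GPU's TLB.
        if vpage in self.entries:
            return self.entries[vpage], "local-hit"
        # 2. Remote hit: probe peer TLBs for shared or spilled translations.
        for peer in self.peers:
            if vpage in peer.entries:
                return peer.entries[vpage], f"remote-hit@gpu{peer.gpu_id}"
        # 3. Miss: fall back to a (simulated) page-table walk, then fill locally.
        ppage = self.page_walk(vpage)
        self.fill(vpage, ppage)
        return ppage, "miss"

    def fill(self, vpage, ppage):
        # If the local TLB is full, spill the evicted entry to a peer with
        # spare capacity instead of dropping it.
        if len(self.entries) >= self.capacity:
            victim = next(iter(self.entries))      # FIFO-style victim choice
            victim_ppage = self.entries.pop(victim)
            for peer in self.peers:
                if len(peer.entries) < peer.capacity:
                    peer.entries[victim] = victim_ppage
                    break
        self.entries[vpage] = ppage

    @staticmethod
    def page_walk(vpage):
        # Stand-in for the expensive IOMMU/page-table walk.
        return vpage ^ 0xABCD


# Usage: two GPUs; GPU0's evicted translation is spilled to GPU1 and later
# found there as a remote hit instead of triggering another page walk.
gpu0 = SpillAwareTLB(gpu_id=0, capacity=2)
gpu1 = SpillAwareTLB(gpu_id=1, capacity=2)
gpu0.peers, gpu1.peers = [gpu1], [gpu0]

for vp in (0x1, 0x2, 0x3, 0x1):
    print(hex(vp), gpu0.lookup(vp)[1])
```

Under these assumptions, the last access to page 0x1 is served by the peer GPU's TLB rather than a page walk, which is the kind of cross-GPU reuse a sharing/spilling-aware hierarchy aims to exploit.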
CITATION STYLE
Li, B., Yin, J., Zhang, Y., & Tang, X. (2021). Improving address translation in multi-GPUs via sharing and spilling aware TLB design. In Proceedings of the Annual International Symposium on Microarchitecture, MICRO (pp. 1154–1168). IEEE Computer Society. https://doi.org/10.1145/3466752.3480083