Improving address translation in multi-GPUs via sharing and spilling aware TLB design

Abstract

In recent years, ever-growing application complexity and input dataset sizes have driven the popularity of multi-GPU systems as a desirable computing platform for many application domains. While employing multiple GPUs intuitively exposes substantial parallelism for application acceleration, the delivered performance rarely scales with the number of GPUs. One of the major challenges behind this is address translation efficiency. Many prior works focus on CPUs or single-GPU execution scenarios, while address translation in multi-GPU systems has received little attention. In this paper, we conduct a comprehensive investigation of address translation efficiency in both the "single-application-multi-GPU" and "multi-application-multi-GPU" execution paradigms. Based on our observations, we propose a new TLB hierarchy design, called least-TLB, that is tailored for multi-GPU systems and effectively improves TLB performance with minimal hardware overhead. Experimental results on 9 single-application workloads and 10 multi-application workloads indicate that the proposed least-TLB improves performance by 23.5% and 16.3% on average, respectively.

Cite

Li, B., Yin, J., Zhang, Y., & Tang, X. (2021). Improving address translation in multi-GPUs via sharing and spilling aware TLB design. In Proceedings of the Annual International Symposium on Microarchitecture, MICRO (pp. 1154–1168). IEEE Computer Society. https://doi.org/10.1145/3466752.3480083
