High GPU performance can only be achieved if a kernel efficiently uses the multi-layered compute and memory hierarchies. For example, accelerators such as NVIDIA's Tensor Cores require specific mappings of threads to data that must be considered in data movements to and from registers. Current compilers struggle to match the performance of vendor libraries like cuBLAS, which are developed by experts in assembly. This manual low-level coding is time-consuming and makes it hard to unlock the full GPU potential, preventing the experimentation needed to achieve even higher performance. In this paper we introduce Fireiron, a scheduling language aimed at performance experts. Fireiron provides high-level abstractions for expressing GPU optimizations that are unavailable to compilers today and so far must be written in assembly. Our innovation is that both computations and data movements are first-class concepts that can be separately mapped to threads, as required for the efficient use of specialized hardware like Tensor Cores. We evaluate Fireiron on three GPU architectures against expert-written advanced matrix multiplications. First, we show that Fireiron schedules are able to express the strategies of these implementations while requiring about 6× fewer lines of code. Second, we show that the code generated from Fireiron schedules outperforms the fastest implementations (provided by cuBLAS) by more than 2×.
CITATION STYLE
Hagedorn, B., Elliott, A. S., Barthels, H., Bodik, R., & Grover, V. (2020). Fireiron: A data-movement-aware scheduling language for GPUs. In Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT (pp. 71–82). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1145/3410463.3414632