Abstract
Many low-level optimizations for NVIDIA GPUs can only be implemented in native hardware assembly (SASS). However, programming in SASS is unproductive and not portable. To simplify low-level GPU programming, we present GAS (Gpu ASsembly), a PTX-like language that provides a stable instruction set across hardware architectures while giving programmers low-level control of code execution. We demonstrate that GAS can be used with ease for low-level benchmarking and performance tuning in the context of Tensor Core HGEMM.
Cite
Yan, D., Wang, W., & Chu, X. (2021). Simplifying low-level GPU programming with GAS. In Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP (pp. 469–471). Association for Computing Machinery. https://doi.org/10.1145/3437801.3441591