An autotuning framework for scalable execution of tiled code via iterative polyhedral compilation

Yukinori Sato; Tomoya Yuki; Toshio Endo

Journal Article, Open Access

ACM Transactions on Architecture and Code Optimization (2019) 15(4)

DOI: 10.1145/3293449

11 Citations

18 Readers

Abstract

On modern many-core CPUs, performance tuning against complex memory subsystems and scalability for parallelism is mandatory to achieve their potential. In this article, we focus on loop tiling, which plays an important role in performance tuning, and develop a novel framework that analytically models the load balance and empirically autotunes unpredictable cache behaviors through iterative polyhedral compilation using LLVM/Polly. From an evaluation on many-core CPUs, we demonstrate that our autotuner achieves a performance superior to those that use conventional static approaches and well-known autotuning heuristics. Moreover, our autotuner achieves almost the same performance as a brute-force search-based approach.
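
For readers unfamiliar with loop tiling, the following is a minimal sketch of the general technique, not the framework described in the article. It shows a tiled matrix multiplication with the tile size as a tunable parameter, plus a simple count of how evenly tiles divide among threads. The names `tiled_matmul` and `load_imbalance` and the imbalance metric itself are hypothetical illustrations and are not taken from the paper, whose load-balance model and Polly integration are not reproduced here.

```python
import math


def tiled_matmul(a, b, tile):
    """Multiply square matrices a and b (lists of lists) using loop tiling.

    The three outer loops step over tile origins; the three inner loops
    stay inside one tile so that a small block of data is reused.
    """
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, tile):
        for kk in range(0, n, tile):
            for jj in range(0, n, tile):
                # min(...) handles partial tiles when tile does not divide n.
                for i in range(ii, min(ii + tile, n)):
                    for k in range(kk, min(kk + tile, n)):
                        aik = a[i][k]
                        for j in range(jj, min(jj + tile, n)):
                            c[i][j] += aik * b[k][j]
    return c


def load_imbalance(n, tile, threads):
    """Hypothetical metric (an assumption, not the paper's model).

    Assumes the outermost tile loop is split across threads and returns
    the busiest thread's tile count divided by the ideal even share.
    A value of 1.0 means the tiles divide evenly.
    """
    tiles = math.ceil(n / tile)
    busiest = math.ceil(tiles / threads)
    return busiest / (tiles / threads)
```

Under these assumptions, `load_imbalance(1024, 64, 16)` gives 1.0 (16 tiles over 16 threads), while `load_imbalance(1024, 128, 16)` gives 2.0 because only 8 tiles exist and half the threads sit idle. This illustrates why a tile size that suits the cache can still scale poorly, which is the kind of trade-off an autotuner has to weigh.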

Cite

Sato, Y., Yuki, T., & Endo, T. (2019). An autotuning framework for scalable execution of tiled code via iterative polyhedral compilation. ACM Transactions on Architecture and Code Optimization, 15(4). https://doi.org/10.1145/3293449
