Abstract
Recent neural accelerators often comprise multiple neural processing units (NPUs) that share a cache and memory. The regular schedules produced by state-of-the-art scheduling techniques miss important opportunities for memory reuse. This paper presents Flexer, an out-of-order (OoO) scheduler that maximizes instruction-level parallelism and data reuse on such multi-NPU systems. Flexer employs a list scheduling algorithm to dynamically schedule the tiled workload across all NPUs. To cope with the irregular data access patterns of OoO schedules, several heuristics maximize data reuse by considering the availability of data tiles at the different levels of the memory hierarchy. Evaluated with several neural networks on 2- to 4-core multi-NPU systems, Flexer achieves a speedup of up to 2.2x and a 1.2-fold reduction in data transfers for individual layers compared to the best static execution order.
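To illustrate the kind of list scheduling with reuse heuristics the abstract describes, the sketch below shows a greedy scheduler that dispatches ready tile operations to whichever NPU becomes free and prefers operations whose input tiles are already resident in the shared cache. This is not the paper's implementation: the names (TileOp, schedule, shared_cache), the single reuse heuristic, and the eviction-free cache model are all simplifying assumptions standing in for the several heuristics and memory levels Flexer considers.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Set

@dataclass
class TileOp:
    op_id: int
    cycles: int                                   # estimated execution latency
    inputs: FrozenSet[int]                        # ids of data tiles this op reads
    deps: Set[int] = field(default_factory=set)   # predecessor op ids

def schedule(ops: List[TileOp], npu_count: int = 4):
    """Greedy list scheduling: whenever an NPU becomes free, dispatch the
    ready op whose input tiles are best covered by the shared cache."""
    by_id = {op.op_id: op for op in ops}
    remaining_deps = {op.op_id: set(op.deps) for op in ops}
    successors = {op.op_id: [] for op in ops}
    for op in ops:
        for d in op.deps:
            successors[d].append(op.op_id)

    shared_cache: Set[int] = set()        # tile ids currently resident (no eviction modeled)
    finish = {}                           # op_id -> finish cycle
    npu_free_at = [0] * npu_count         # next free cycle of each NPU
    ready = [op.op_id for op in ops if not op.deps]
    order = []                            # (op_id, npu, start cycle)

    while ready:
        npu = min(range(npu_count), key=npu_free_at.__getitem__)
        # Reuse heuristic: prefer the ready op with the most cached input tiles.
        ready.sort(key=lambda i: len(by_id[i].inputs & shared_cache), reverse=True)
        oid = ready.pop(0)
        op = by_id[oid]
        start = max(npu_free_at[npu], max((finish[d] for d in op.deps), default=0))
        finish[oid] = start + op.cycles
        npu_free_at[npu] = finish[oid]
        shared_cache |= op.inputs          # inputs are now resident for later consumers
        order.append((oid, npu, start))
        for s in successors[oid]:          # release dependents that become ready
            remaining_deps[s].discard(oid)
            if not remaining_deps[s]:
                ready.append(s)
    return order
```

In this toy model, the reuse-aware ordering is the only deviation from plain list scheduling; the paper's dynamic, out-of-order scheduler additionally weighs tile availability across multiple memory levels when ranking ready operations.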
Citation
Min, H., Kwon, J., & Egger, B. (2023). Flexer: Out-of-Order Scheduling for Multi-NPUs. In CGO 2023 - Proceedings of the 21st ACM/IEEE International Symposium on Code Generation and Optimization (pp. 212–223). Association for Computing Machinery, Inc. https://doi.org/10.1145/3579990.3580025