Work-stealing schedulers are common in shared-memory environments. In large-scale distributed-memory systems, however, their use has been limited to specific ad-hoc implementations, which has prevented broader adoption. In this paper we introduce a new scalable work-stealing algorithm for distributed-memory systems, together with our implementation of it, the TITUS_DLB library. The algorithm is based on Kleinberg’s small-world graph. It makes it possible to control communication patterns and the associated runtime overheads while providing efficient heuristics for victim selection and result routing. To validate our approach, we present the DLB_Bench benchmark, which emulates arbitrary workload distributions and imbalance characteristics. Finally, we compare TITUS_DLB to the ad-hoc solution developed for the YALES2 computational fluid dynamics and combustion solver. We achieve a performance gain of up to 54% on thousands of cores.
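To make the small-world idea concrete, the following is a minimal sketch of how a process might build a Kleinberg-style neighbor set and pick a steal victim from it. It is not the TITUS_DLB implementation: the ring layout of ranks, the function names (`ring_distance`, `small_world_neighbors`, `pick_victim`), the exponent `r`, and the uniform victim choice are all assumptions made for illustration. Kleinberg's model is usually stated on a grid; a ring is used here only to keep the example short.

```python
import random


def ring_distance(a: int, b: int, n: int) -> int:
    """Shortest hop count between ranks a and b on a ring of n ranks."""
    d = abs(a - b) % n
    return min(d, n - d)


def small_world_neighbors(rank, n, num_long_links=1, r=1.0, rng=None):
    """Return the sorted neighbor list of `rank` (hypothetical helper).

    Local links: the two adjacent ranks on the ring.
    Long-range links: drawn with probability proportional to
    distance ** -r, as in Kleinberg's small-world model.
    """
    rng = rng or random.Random()
    local = {(rank - 1) % n, (rank + 1) % n}
    local.discard(rank)  # only relevant for a ring of size 1

    # Every rank that is neither this rank nor a local neighbor.
    candidates = [v for v in range(n) if v != rank and v not in local]
    weights = [ring_distance(rank, v, n) ** -r for v in candidates]

    long_links = set()
    if candidates:
        wanted = min(num_long_links, len(candidates))
        while len(long_links) < wanted:
            long_links.add(rng.choices(candidates, weights=weights, k=1)[0])

    return sorted(local | long_links)


def pick_victim(neighbors, rng):
    """Assumed policy: choose a steal victim uniformly among neighbors."""
    return rng.choice(neighbors)


if __name__ == "__main__":
    rng = random.Random(0)
    neighbors = small_world_neighbors(rank=3, n=16, num_long_links=2, rng=rng)
    print("neighbors of rank 3:", neighbors)
    print("steal victim:", pick_victim(neighbors, rng))
```

Restricting steal requests to this small neighbor set is what bounds the number of communication partners per process in the sketch, while the long-range links keep distant, idle ranks reachable in a few hops.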
Fontenaille, C., Petit, E., de Oliveira Castro, P., Uemura, S., Sohier, D., Lesnicki, P., … Moureau, V. (2019). Scalable work-stealing load-balancer for HPC distributed memory systems. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11339 LNCS, pp. 146–158). Springer Verlag. https://doi.org/10.1007/978-3-030-10549-5_12