Increasing node and cores-per-node counts in supercomputers render scheduling and load balancing critical for exploiting parallelism. OpenMP applications can achieve high performance via careful selection of the scheduling kind and chunk parameter on a per-loop, per-application, and per-system basis from a portfolio of advanced scheduling algorithms (Korndörfer et al., 2022). This manual selection is time-consuming and challenging, and the best choice may change during execution. We propose Auto4OMP, a novel approach for automated load balancing of OpenMP applications. With Auto4OMP, we introduce three scheduling algorithm selection methods and an expert-defined chunk parameter for the kind and chunk of OpenMP's schedule clause, respectively. Auto4OMP extends the OpenMP schedule(auto) and chunk parameter implementation in LLVM's OpenMP runtime library to automatically select a scheduling algorithm and calculate a chunk parameter during execution. Auto4OMP infers loop characteristics from the loop's execution over the application's time-steps. The experiments performed in this work show that Auto4OMP improves application performance by up to 11% compared to LLVM's schedule(auto) implementation and outperforms manual selection. Auto4OMP improves the performance of MPI+OpenMP applications by explicitly minimizing thread-load imbalance and implicitly reducing process-load imbalance.
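To make the kind and chunk of the schedule clause concrete, the following minimal C sketch contrasts a manually chosen schedule with schedule(auto), which delegates the choice to the runtime. The function names, the workload in work(), and the values dynamic and 16 are hypothetical and chosen only for illustration; they are not taken from the paper, and the sketch does not show how Auto4OMP itself selects algorithms.

```c
/* Hypothetical irregular workload: iteration i performs (i % 64) + 1 additions,
 * so iterations differ in cost and the loop can be load-imbalanced. */
static double work(long i) {
    double acc = 0.0;
    for (long k = 0; k <= i % 64; ++k)
        acc += (double)k;
    return acc;
}

/* Manual selection: the programmer fixes the kind (dynamic) and the chunk (16).
 * Both values are illustrative assumptions, not recommendations. */
double sum_manual(long n) {
    double total = 0.0;
    #pragma omp parallel for schedule(dynamic, 16) reduction(+:total)
    for (long i = 0; i < n; ++i)
        total += work(i);
    return total;
}

/* Delegated selection: schedule(auto) leaves the scheduling decision to the
 * compiler and runtime; what it does is implementation-defined. */
double sum_auto(long n) {
    double total = 0.0;
    #pragma omp parallel for schedule(auto) reduction(+:total)
    for (long i = 0; i < n; ++i)
        total += work(i);
    return total;
}
```

Both functions compute the same sum; only the distribution of iterations to threads differs. Compiled without OpenMP support, the pragmas are ignored and the loops run sequentially.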
CITATION STYLE
Mohammed, A., Korndörfer, J. H. M., Eleliemy, A., & Ciorba, F. M. (2022). Automated Scheduling Algorithm Selection and Chunk Parameter Calculation in OpenMP. IEEE Transactions on Parallel and Distributed Systems, 33(12), 4383–4394. https://doi.org/10.1109/TPDS.2022.3189270