We present a highly efficient implementation of OpenMP tasks. It is built on a runtime infrastructure designed for data locality, a crucial prerequisite for exploiting the NUMA nature of modern multicore multiprocessors. In addition, it employs fast work-stealing structures based on a novel, efficient, and fair blocking algorithm. Synthetic benchmarks show up to a 6-fold increase in throughput (tasks completed per second), while for a task-based OpenMP application suite we measured up to an 87% reduction in execution time, compared with other OpenMP implementations. © 2012 Springer-Verlag.
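For readers unfamiliar with the tasking model, the following is a minimal sketch of the kind of fine-grained task workload whose throughput depends on the runtime's task queues and stealing policy. It uses only standard OpenMP directives (`task`, `taskwait`, `parallel`, `single`); the function names `fib_task` and `fib_parallel` are hypothetical and chosen for illustration. It is not taken from the paper and says nothing about how the authors' runtime is implemented internally.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical example workload (not from the paper): naive Fibonacci,
 * spawning one OpenMP task per recursive call. */
static long fib_task(int n)
{
    assert(n >= 0);
    if (n < 2)
        return n;

    long a, b;
    /* Each child call becomes a task that any thread in the team may run.
     * a and b are locals, so they must be marked shared to receive results. */
    #pragma omp task shared(a) firstprivate(n)
    a = fib_task(n - 1);
    #pragma omp task shared(b) firstprivate(n)
    b = fib_task(n - 2);
    /* Wait for both child tasks before combining their results. */
    #pragma omp taskwait
    return a + b;
}

/* Creates the thread team; a single thread generates the root task while
 * the remaining threads pick up the tasks it spawns. */
long fib_parallel(int n)
{
    long result = 0;
    #pragma omp parallel
    {
        #pragma omp single
        result = fib_task(n);
    }
    return result;
}
```

Built with an OpenMP-enabled compiler (for example `gcc -fopenmp`), `fib_parallel(20)` returns 6765. Without OpenMP support the pragmas are ignored and the same code runs serially, which gives a convenient baseline when comparing task overheads across runtimes.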
Citation
Agathos, S. N., Kallimanis, N. D., & Dimakopoulos, V. V. (2012). Speeding up OpenMP tasking. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7484 LNCS, pp. 650–661). https://doi.org/10.1007/978-3-642-32820-6_64