OpenMP 4.5 introduced a task-parallel counterpart to the classical thread-parallel for-loop construct: the taskloop construct. With this new construct, programmers can choose between the two parallel paradigms when parallelizing their for loops. However, it is unclear when and where each approach should be used to write efficient parallel applications. In this paper, we explore the taskloop construct and study the performance differences between traditional thread-parallel for loops and the new taskloop directive. We introduce an efficient implementation and compare it to other taskloop implementations using micro- and kernel-benchmarks, as well as an application. We show that our taskloop implementation yields, on average, a 3.2% increase in peak performance over corresponding parallel-for loops.
CITATION STYLE
Podobas, A., & Karlsson, S. (2016). Towards unifying OpenMP under the task-parallel paradigm: Implementation and performance of the taskloop construct. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 9903 LNCS, 116–129. https://doi.org/10.1007/978-3-319-45550-1_9