Nested parallelism: Allocation of threads to tasks and openMP implementation

11Citations
Citations of this article
6Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

In this paper we discuss the use of nested parallelism. Our claim is that if the problem naturally possesses multiple levels of parallelism, then applying parallelism to all levels may significantly enhance the scalability of your algorithm. This claim is sustained by numerical experiments. We also discuss how to implement multi-level parallelism using OpenMP. We find current OpenMP implementation, based on version 1.0, to have severe limitation for implementing nested parallelization. We then show how this can be circumvented by explicitly assign task to threads. Load balancing issues become more complicated with two (or more) levels of parallelism. To handle this problem, we have designed a distribution algorithm which groups threads into teams, each team being responsible for one course grain outer-level task. This algorithm is proven to produce the optimal load balance, under given assumptions.

Cite

CITATION STYLE

APA

Blikberg, R., & Sørevik, T. (2001). Nested parallelism: Allocation of threads to tasks and openMP implementation. Scientific Programming, 9(2–3), 185–194. https://doi.org/10.1155/2001/821575

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free