MPI is a message-passing standard widely used for developing high-performance parallel applications. Because of restrictions in the MPI computation model, conventional implementations on shared-memory machines map each MPI node to an OS process, an approach that suffers serious performance degradation in the presence of multiprogramming, especially when a space/time-sharing policy is employed in OS job scheduling. In this paper, we study compile-time and run-time support for MPI using threads and demonstrate our optimization techniques for executing a large class of MPI programs written in C. The compile-time transformation adopts thread-specific data structures to eliminate the use of global and static variables in C code. The run-time support includes an efficient point-to-point communication protocol based on a novel lock-free queue management scheme. Our experiments on an SGI Origin 2000 show that our MPI prototype, called TMPI, using the proposed techniques is competitive with SGI's native MPI implementation in a dedicated environment, and that it has significant performance advantages, with up to a 23-fold improvement, in a multiprogrammed environment. © 1999 ACM.
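The compile-time transformation described in the abstract (replacing globals with thread-specific data so that each MPI node can run as a thread within one process) can be sketched roughly as follows. This is a minimal illustration using POSIX `pthread_key_t` storage, not the paper's actual transformation; the variable name `rank_counter` and the helper `get_counter` are hypothetical.

```c
#include <pthread.h>
#include <stdlib.h>
#include <assert.h>

/* Hypothetical original code in an MPI program:
 *     int rank_counter = 0;    -- one shared global, unsafe if each
 *                                 MPI node becomes a thread.
 * The transformation gives every thread (MPI node) a private copy. */

static pthread_key_t counter_key;
static pthread_once_t key_once = PTHREAD_ONCE_INIT;

static void make_key(void) {
    /* free() reclaims each thread's copy when the thread exits */
    pthread_key_create(&counter_key, free);
}

/* Accessor that replaces direct uses of the former global. */
static int *get_counter(void) {
    pthread_once(&key_once, make_key);
    int *c = pthread_getspecific(counter_key);
    if (c == NULL) {
        c = calloc(1, sizeof *c);        /* thread-private, zero-initialized */
        pthread_setspecific(counter_key, c);
    }
    return c;
}

/* Body of one simulated MPI node running as a thread. */
static void *node_body(void *arg) {
    (void)arg;
    for (int i = 0; i < 1000; i++)
        (*get_counter())++;              /* was: rank_counter++ */
    return (void *)(long)*get_counter(); /* report this node's final value */
}
```

Because each thread owns its copy, the increments need no locking and cannot race, which is what lets many MPI nodes share one address space safely.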
CITATION STYLE
Tang, H., Shen, K., & Yang, T. (1999). Compile/run-time support for threaded MPI execution on multiprogrammed shared memory machines. SIGPLAN Notices (ACM Special Interest Group on Programming Languages), 34(8), 107–118. https://doi.org/10.1145/329366.301114