Collective communication performance is critical in many MPI applications, yet relatively few results are available for assessing the performance of mainstream MPI implementations. In this paper we focus on two widely used primitives, broadcast and reduce, and present experimental results for the Cray T3E and the IBM SP2. We compare the performance of the existing MPI primitives with that of our own implementation, which is based on a new algorithm. Our tests show that existing all-software implementations can be improved, and they highlight the advantages of the Cray hardware-assisted implementation.
CITATION STYLE
Bernaschi, M., Iannello, G., & Lauria, M. (1999). Experimental results about MPI collective communication operations. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1593, pp. 774–783). Springer Verlag. https://doi.org/10.1007/bfb0100638