In this paper we present an efficient dense matrix multiplication algorithm for distributed-memory computers with a hypercube topology. The proposed algorithm outperforms all previously proposed algorithms over a wide range of matrix sizes and processor counts, especially for large matrices. We analyze the performance of the algorithms on two types of hypercube architecture: one in which each node can use at most one communication link at a time (to send and receive), and one in which each node can use all of its communication links simultaneously.
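To make the two communication models concrete, the following is a minimal Python sketch of a hypercube in which nodes are labeled with d-bit integers and two nodes are linked when their labels differ in exactly one bit. It is not the algorithm of the paper: `one_port_broadcast` is the standard binomial-tree broadcast, used here only to illustrate the one-link-at-a-time constraint, and `all_port_sends` merely lists the transfers a node could issue in one step if all of its links were usable at once. All function names are hypothetical.

```python
def neighbors(node, dim):
    """Nodes directly linked to `node` in a dim-dimensional hypercube."""
    # Flipping bit k of the label gives the neighbor along dimension k.
    return [node ^ (1 << k) for k in range(dim)]


def one_port_broadcast(root, dim):
    """Binomial-tree broadcast schedule under the one-link-at-a-time model.

    Returns a list of steps; each step is a list of (source, destination)
    pairs. In step k every node that already holds the data forwards it
    along dimension k, so each node uses at most one link per step.
    """
    have = {root}
    schedule = []
    for k in range(dim):
        step = [(src, src ^ (1 << k)) for src in sorted(have)]
        have.update(dst for _, dst in step)
        schedule.append(step)
    return schedule


def all_port_sends(node, dim):
    """Transfers a node could issue in a single step if all links are usable."""
    return [(node, nbr) for nbr in neighbors(node, dim)]


if __name__ == "__main__":
    d = 3  # illustrative dimension: 2**3 = 8 nodes
    for k, step in enumerate(one_port_broadcast(0, d)):
        print(f"step {k}: {step}")
    print("all-port sends from node 0:", all_port_sends(0, d))
```

Under these assumptions the one-port broadcast needs d steps to reach all 2^d nodes, while in the all-port model a single node can drive its d links in the same step; the sketch says nothing about the cost of the matrix multiplication algorithms themselves.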
CITATION
Gupta, H., & Sadayappan, P. (1994). Communication efficient matrix multiplication on hypercubes. In Proceedings of the 6th Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA 1994 (pp. 320–329). Association for Computing Machinery, Inc. https://doi.org/10.1145/181014.181434