Current applications in image and media processing and in scientific and engineering computing require tremendous processing power and high memory bandwidth to achieve high performance. Three-dimensional multi-core and many-core processors stacked with one or more memory layers may provide the processing capability needed to improve the performance of these applications. In this paper, we propose a 3-D stacked many-core processor architecture composed of a number of processing element (PE) layers stacked with one or more memory layers shared among all PEs. Unlike many 3-D machine architectures, the proposed model uses only local communication between PEs over both horizontal and vertical links, which avoids the cost of building specialized interconnection networks. We present a novel memory-efficient SPMD blocked algorithm for performing the kernel matrix-matrix multiply (MMM) operation on the 3-D processor architecture. Our analytical evaluation of the 3-D stacked architecture shows near-linear speedup as the number of PE layers increases, while data communication and redistribution are overlapped with computation. © 2013 Springer Science+Business Media.
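The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of a blocked matrix-matrix multiply whose work is divided among PE layers. It is not the paper's algorithm: the function names, the `num_layers` and `block` parameters, and the choice to give each simulated layer a contiguous band of rows of C are all assumptions made for illustration, and the layers run sequentially here rather than in parallel with overlapped communication.

```python
def matmul_block(A, B, C, i0, i1, j0, j1, k0, k1):
    """Accumulate C[i0:i1][j0:j1] += A[i0:i1][k0:k1] * B[k0:k1][j0:j1]."""
    for i in range(i0, i1):
        for k in range(k0, k1):
            a = A[i][k]
            for j in range(j0, j1):
                C[i][j] += a * B[k][j]


def layered_blocked_matmul(A, B, num_layers=2, block=2):
    """Blocked C = A * B, with rows of C split across simulated PE layers.

    Hypothetical illustration: every layer runs the same code on its own
    band of rows (SPMD style), but here the layers execute one after another.
    Assumes A and B are non-empty lists of lists with compatible shapes.
    """
    n, m, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(n)]
    band = (n + num_layers - 1) // num_layers  # rows of C per layer (ceiling)
    for layer in range(num_layers):
        r0, r1 = layer * band, min((layer + 1) * band, n)
        # Each layer walks its band block by block.
        for i0 in range(r0, r1, block):
            for j0 in range(0, p, block):
                for k0 in range(0, m, block):
                    matmul_block(
                        A, B, C,
                        i0, min(i0 + block, r1),
                        j0, min(j0 + block, p),
                        k0, min(k0 + block, m),
                    )
    return C


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[5, 6], [7, 8]]
    print(layered_blocked_matmul(A, B, num_layers=2, block=1))
```

Because each layer in this sketch owns a disjoint band of rows of C, no two layers write the same output element. That keeps the sketch simple, but it says nothing about how the paper distributes or redistributes data between layers.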
Zekri, A. S. (2013). Three dimensional SPMD matrix-matrix multiplication algorithm and a stacked many-core processor architecture. In Lecture Notes in Electrical Engineering (Vol. 152 LNEE, pp. 1139–1150). https://doi.org/10.1007/978-1-4614-3535-8_94