Abstract
The Cell Broadband Engine™ is a heterogeneous multi-core architecture developed by IBM, Sony and Toshiba. It has eight computation-intensive cores (SPEs), each with a small local memory, and a single PowerPC core. The SPEs have a total peak performance of 204.8 Gflop/s in single precision and 14.64 Gflop/s in double precision. The Cell therefore has good potential for high performance computing, but its unconventional architecture makes it difficult to program. We propose an implementation of the core features of MPI as a solution to this problem, which can enable a large class of existing applications to be ported to the Cell. Our MPI implementation attains bandwidth of up to 6.01 GB/s and latency as low as 0.41 μs. The significance of our work is in demonstrating the effectiveness of intra-Cell MPI, consequently enabling the porting of MPI applications to the Cell with minimal effort. © Springer-Verlag Berlin Heidelberg 2007.
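The single-precision peak figure can be reproduced with a short back-of-the-envelope calculation. The sketch below rests on assumptions the abstract does not state: a 3.2 GHz SPE clock, 4-wide single-precision SIMD, and one fused multiply-add (counted as 2 flops) issued per lane per cycle. The symbols are introduced here for illustration only.

```latex
% Assumed parameters (not given in the abstract):
%   N_SPE = 8        number of SPEs
%   f     = 3.2 GHz  SPE clock frequency
%   w     = 4        single-precision SIMD lanes per SPE
%   c     = 2        flops per lane per cycle (fused multiply-add)
\[
  P_{\text{peak}}^{\text{SP}}
    = N_{\text{SPE}} \cdot f \cdot w \cdot c
    = 8 \times 3.2\,\text{GHz} \times 4 \times 2
    = 204.8\ \text{Gflop/s}
\]
```

Under these assumptions the result matches the 204.8 Gflop/s quoted above. The double-precision figure is far lower than the single-precision one, so it cannot be obtained simply by halving the lane count and is not derived here.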
Kumar, A., Senthilkumar, G., Krishna, M., Jayam, N., Baruah, P. K., Sharma, R., … Kapoor, S. (2007). A buffered-mode MPI implementation for the Cell BE™ processor. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4487 LNCS, pp. 603–610). Springer Verlag. https://doi.org/10.1007/978-3-540-72584-8_80