Montgomery multiplication on the Cell

Abstract

A technique to speed up Montgomery multiplication on the Synergistic Processor Elements (SPEs) of the Cell Broadband Engine is proposed. The technique splits a number into four consecutive parts and places each part in one of the four element positions of a vector, so that the parts form the columns of a 4-SIMD organization. This representation allows the arithmetic to be performed in a 4-SIMD fashion. For odd moduli of 160 to 2048 bits, an implementation of Montgomery multiplication using this technique is up to 2.47 times faster than the unrolled implementation of Montgomery multiplication that is part of the IBM multi-precision math library. The presented technique can also be applied to speed up Montgomery multiplication on other SIMD architectures. © 2010 Springer-Verlag Berlin Heidelberg.
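The representation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' SPE implementation and does not perform any SIMD arithmetic. It only shows how a number might be split into four consecutive parts that become the four lanes of each vector, next to a plain big-integer Montgomery product for reference. The 16-bit word size, the little-endian limb order, and the names `to_columns`, `from_columns`, and `montgomery_multiply` are assumptions made for illustration.

```python
# Illustrative sketch only: names, word size, and limb order are assumptions.
# Requires Python 3.8+ for pow(m, -1, modulus).

def to_columns(x, n_words, word_bits=16):
    """Split x into n_words limbs, cut them into four consecutive parts,
    and return 4-lane tuples where lane k holds a limb from part k."""
    assert n_words % 4 == 0, "limb count must be a multiple of four"
    mask = (1 << word_bits) - 1
    limbs = [(x >> (word_bits * i)) & mask for i in range(n_words)]
    q = n_words // 4  # limbs per part
    return [tuple(limbs[k * q + i] for k in range(4)) for i in range(q)]

def from_columns(vectors, word_bits=16):
    """Inverse of to_columns: rebuild the integer from the 4-lane tuples."""
    q = len(vectors)
    x = 0
    for i, vec in enumerate(vectors):
        for k, limb in enumerate(vec):
            x |= limb << (word_bits * (k * q + i))
    return x

def montgomery_multiply(a, b, m, r_bits):
    """Reference Montgomery product a * b * R^-1 mod m, with R = 2**r_bits.
    Assumes m is odd and 0 <= a, b < m < R."""
    r_mask = (1 << r_bits) - 1
    m_prime = (-pow(m, -1, 1 << r_bits)) & r_mask  # -m^-1 mod R
    t = a * b
    u = ((t & r_mask) * m_prime) & r_mask          # makes t + u*m divisible by R
    result = (t + u * m) >> r_bits                 # exact division by R
    return result - m if result >= m else result   # single conditional subtraction
```

With eight 16-bit limbs whose values are 0 through 7, `to_columns` produces two vectors: the first holds limbs 0, 2, 4, 6 and the second holds limbs 1, 3, 5, 7, so each lane walks through one of the four consecutive parts.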

Cite

Bos, J. W., & Kaihara, M. E. (2010). Montgomery multiplication on the cell. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6067 LNCS, pp. 477–485). https://doi.org/10.1007/978-3-642-14390-8_50
