Architectural enhancements for montgomery multiplication on embedded RISC processors

21Citations
Citations of this article
25Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Montgomery multiplication normally spends over 90% of its execution time in inner loops executing some kind of multiply-and-add operations. The performance of these critical code sections can be greatly improved by customizing the processor's instruction set for low-level arithmetic functions. In this paper, we investigate the potential of architectural enhancements for multiple-precision Montgomery multiplication according to the so-called Finely Integrated Product Scanning (FIPS) method. We present instruction set extensions to accelerate the FIPS inner loop operation based on the availability of a multiply/accumulate (MAC) unit with a wide accumulator. Finally, we estimate the execution time of a 1024-bit Montgomery multiplication on an extended MIPS32 core and discuss the impact of the multiplier latency. © Springer-Verlag Berlin Heidelberg 2003.

Cite

CITATION STYLE

APA

Großschädl, J., & Kamendje, G. A. (2003). Architectural enhancements for montgomery multiplication on embedded RISC processors. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2846, 418–434. https://doi.org/10.1007/978-3-540-45203-4_32

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free