Montgomery multiplication using vector instructions

28Citations
Citations of this article
32Readers
Mendeley users who have this article in their library.
Get full text

Abstract

In this paper we present a parallel approach to compute interleaved Montgomery multiplication. This approach is particularly suitable to be computed on 2-way single instruction, multiple data platforms as can be found on most modern computer architectures in the form of vector instruction set extensions. We have implemented this approach for tablet devices which run the x86 architecture (Intel Atom Z2760) using SSE2 instructions as well as devices which run on the ARM platform (Qualcomm MSM8960, NVIDIA Tegra 3 and 4) using NEON instructions. When instantiating modular exponentiation with this parallel version of Montgomery multiplication we observed a performance increase of more than a factor of 1.5 compared to the sequential implementation in OpenSSL for the classical arithmetic logic unit on the Atom platform for 2048-bit moduli. © 2014 Springer-Verlag.

Cite

CITATION STYLE

APA

Bos, J. W., Montgomery, P. L., Shumow, D., & Zaverucha, G. M. (2014). Montgomery multiplication using vector instructions. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8282 LNCS, pp. 471–489). Springer Verlag. https://doi.org/10.1007/978-3-662-43414-7_24

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free