How to maximize software performance of symmetric primitives on Pentium III and 4 processors

Abstract

This paper discusses state-of-the-art software optimization methodology for symmetric cryptographic primitives on Pentium III and 4 processors. We aim to maximize speed by taking the internal pipeline architecture of these processors into account. This is the first paper to study the optimization of ciphers on Prescott, a new core of the Pentium 4. Our AES program with a 128-bit key achieves 251 cycles/block on Pentium 4, which is, to the best of our knowledge, the fastest implementation of AES on Pentium 4. We also optimize the SNOW2.0 keystream generator. Our SNOW2.0 program for Pentium III runs at a rate of 2.75 μops/cycle, which appears to be the most efficient code ever written for a real-world cipher primitive. For the FOX128 block cipher, we propose a speed-up technique that interleaves two independent blocks using register group separation. Finally, we consider fast implementations of SHA512 and Whirlpool, two hash functions with a genuine 64-bit architecture. We show that the new SIMD instruction sets introduced in Pentium 4 contribute substantially to fast hashing with SHA512. © International Association for Cryptologic Research 2005.
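
The interleaving idea mentioned in the abstract can be illustrated with a minimal C sketch. This is not the authors' code and not FOX128: `toy_f`, `toy_encrypt1`, `toy_encrypt2`, and `ROUNDS` are hypothetical names, and the round function is a made-up stand-in. The sketch only shows the structural point, namely that two independent blocks can advance through each round together so that neither computation waits on the other. The paper works at the assembly level, where each block is assigned its own register group; in C, register allocation is left to the compiler, so that part is assumed rather than demonstrated.

```c
#include <stdint.h>

#define ROUNDS 8 /* hypothetical round count for the toy cipher */

/* Hypothetical toy round function: a stand-in only, NOT FOX128. */
static inline uint32_t toy_f(uint32_t x, uint32_t k) {
    x += k;
    x ^= (x << 7) | (x >> 25); /* XOR with a 7-bit left rotation of x */
    return x * 0x9E3779B1u;
}

/* Reference version: one 64-bit block, held as two 32-bit halves. */
static void toy_encrypt1(uint32_t blk[2], const uint32_t rk[ROUNDS]) {
    uint32_t l = blk[0], r = blk[1];
    for (int i = 0; i < ROUNDS; i++) {
        uint32_t t = l ^ toy_f(r, rk[i]);
        l = r;
        r = t;
    }
    blk[0] = l;
    blk[1] = r;
}

/* Interleaved version: blocks a and b share each round iteration.
 * The a-variables and b-variables never depend on each other, so an
 * out-of-order core can overlap the two dependency chains. */
static void toy_encrypt2(uint32_t a[2], uint32_t b[2],
                         const uint32_t rk[ROUNDS]) {
    uint32_t al = a[0], ar = a[1]; /* state "group" for block a */
    uint32_t bl = b[0], br = b[1]; /* state "group" for block b */
    for (int i = 0; i < ROUNDS; i++) {
        uint32_t ta = al ^ toy_f(ar, rk[i]);
        uint32_t tb = bl ^ toy_f(br, rk[i]);
        al = ar; ar = ta;
        bl = br; br = tb;
    }
    a[0] = al; a[1] = ar;
    b[0] = bl; b[1] = br;
}
```

Calling `toy_encrypt2(a, b, rk)` should give the same result as calling `toy_encrypt1` on `a` and `b` separately; any speed difference depends on the compiler and the processor and is not claimed here.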

Cite

Matsui, M., & Fukuda, S. (2005). How to maximize software performance of symmetric primitives on Pentium III and 4 processors. In Lecture Notes in Computer Science (Vol. 3557, pp. 398–412). Springer Verlag. https://doi.org/10.1007/11502760_27
