Abstract
This paper discusses state-of-the-art software optimization methodology for symmetric cryptographic primitives on Pentium III and Pentium 4 processors. We aim to maximize speed by taking the internal pipeline architecture of these processors into account. This is the first paper to study the optimization of ciphers on Prescott, a new core of the Pentium 4. Our AES program with a 128-bit key achieves 251 cycles/block on Pentium 4, which is, to the best of our knowledge, the fastest implementation of AES on Pentium 4. We also optimize the SNOW2.0 keystream generator. Our SNOW2.0 program for Pentium III runs at a rate of 2.75 μops/cycle, which appears to be the most efficient code ever written for a real-world cipher primitive. For the FOX128 block cipher, we propose a speed-up technique that interleaves two independent blocks using register group separation. Finally, we consider fast implementations of SHA512 and Whirlpool, two hash functions with a genuine 64-bit architecture. We show that the new SIMD instruction sets introduced in Pentium 4 contribute significantly to fast hashing with SHA512. © International Association for Cryptologic Research 2005.
Cite
Matsui, M., & Fukuda, S. (2005). How to maximize software performance of symmetric primitives on Pentium III and 4 processors. In Lecture Notes in Computer Science (Vol. 3557, pp. 398–412). Springer Verlag. https://doi.org/10.1007/11502760_27