The Fused Multiply Add (FMA) block is an important module in high-speed math co-processors and crypto processors. The main contribution of this paper is a reduction in FMA latency. The key components of a multi-mode FMA unit are the alignment shifter, the normalization shifter, the multiplier, and a dual adder built from carry-lookahead adders. The major technical challenges in existing FMA architectures are latency and precision. To reduce latency, the multiplier is designed as a reduced-complexity Wallace multiplier, which lowers the latency of the overall architecture by 15–25 %; the total delay of this multiplier is found to be 37.673 ns. To obtain higher precision, the alignment shifter and normalization shifter in the FMA unit are explicitly designed as barrel shifters. The shifter blocks they replace offer lower precision, whereas the barrel-shifter versions give higher precision and reduce the latency by a further 25–35 %. The total delay of the alignment shifter and normalization shifter using barrel shifters is found to be 5.845 ns.
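To illustrate why a barrel shifter suits the alignment and normalization stages, the following is a minimal behavioural sketch in Python, not the paper's FPGA design. It models a logical right shift as a cascade of log2(width) multiplexer stages, where stage k either passes the value through or shifts it by 2^k bits. The function name `barrel_shift_right`, the default 8-bit width, and the choice of a logical right shift are assumptions made for this example only.

```python
def barrel_shift_right(value: int, shift: int, width: int = 8) -> int:
    """Behavioural model of a logical right barrel shifter (illustrative only).

    Assumption: width is a power of two, so the shifter needs exactly
    log2(width) mux stages, one per bit of the shift amount.
    """
    if width <= 0 or width & (width - 1):
        raise ValueError("width must be a positive power of two")

    mask = (1 << width) - 1
    value &= mask                    # keep only the low `width` bits
    stages = width.bit_length() - 1  # log2(width) for a power of two
    shift &= width - 1               # only the low log2(width) bits are used

    for stage in range(stages):
        # Each stage acts as a 2:1 mux: pass through, or shift by 2**stage.
        if (shift >> stage) & 1:
            value >>= 1 << stage
    return value


if __name__ == "__main__":
    # Shift 0b10110000 right by 3 positions in an 8-bit datapath.
    print(bin(barrel_shift_right(0b10110000, 3)))
```

Because the number of stages grows with log2(width) instead of with the shift distance, any shift amount passes through the same fixed number of stages. This is the general property that makes barrel shifters attractive for wide operands such as a 128-bit FMA datapath.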
Kakde, S., Mahindra, M., Khobragade, A., & Shah, N. (2015). FPGA implementation of 128-bit fused multiply add unit for crypto processors. In Communications in Computer and Information Science (Vol. 536, pp. 78–85). Springer Verlag. https://doi.org/10.1007/978-3-319-22915-7_8