We present a novel full hardware implementation of Streamlined NTRU Prime in two variants: a high-speed, high-area implementation and a slower, low-area implementation. We introduce several new techniques that improve performance, including batch inversion for key generation, a high-speed schoolbook polynomial multiplier, an NTT polynomial multiplier combined with a CRT map, a new DSP-free modular reduction method, a high-speed radix sorting module, and new encoders and decoders. The high-speed design achieves the fastest speeds reported to date for Streamlined NTRU Prime: 5,007, 10,989, and 64,026 cycles for encapsulation, decapsulation, and key generation, respectively, while running at 285 MHz on a Xilinx Zynq UltraScale+. The entire design uses 40,060 LUTs, 26,384 flip-flops, 36.5 BRAMs, and 31 DSPs.
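To put the cycle counts in wall-clock terms, the short sketch below converts them to latency at the reported 285 MHz clock. It uses only the figures stated in the abstract; the function name `latency_us` and the dictionary layout are illustrative choices, and the result assumes each operation runs for exactly the quoted number of cycles at that frequency.

```python
# Cycle counts and clock frequency as reported in the abstract.
CLOCK_MHZ = 285
CYCLES = {
    "encapsulation": 5007,
    "decapsulation": 10989,
    "key generation": 64026,
}


def latency_us(cycles: int, clock_mhz: float = CLOCK_MHZ) -> float:
    """Return latency in microseconds: cycles divided by cycles-per-microsecond."""
    return cycles / clock_mhz


if __name__ == "__main__":
    for operation, cycles in CYCLES.items():
        print(f"{operation}: {cycles} cycles = {latency_us(cycles):.2f} us")
```

Under these assumptions, encapsulation takes roughly 17.6 µs, decapsulation roughly 38.6 µs, and key generation roughly 224.7 µs.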
Peng, B. Y., Marotzke, A., Tsai, M. H., Yang, B. Y., & Chen, H. L. (2023). Streamlined NTRU Prime on FPGA. Journal of Cryptographic Engineering, 13(2), 167–186. https://doi.org/10.1007/s13389-022-00303-z