Parallel implementations of lea, revisited

14Citations
Citations of this article
14Readers
Mendeley users who have this article in their library.
Get full text

Abstract

In this paper we revisited the parallel implementations of LEA. By taking the advantages of both the light-weight features of LEA and the parallel computation abilities of ARM-NEON platforms, performance is significantly improved. We firstly optimized the implementations on ARM and NEON architectures. For ARM processor, barrel shifter instruction is used to hide the latencies for rotation operations. For NEON engine, the minimum number of NEON registers are assigned to the round key variables by performing the on-time round key loading from ARM registers. This approach reduces the required NEON registers for round key variables by three registers and the registers and temporal registers are used to retain four more plaintext for encryption operation. Furthermore, we finely transform the data into SIMD format by using transpose and swap instructions. The compact ARM and NEON implementations are combined together and computed in mixed processing way. This approach hides the latency of ARM computations into NEON overheads. Finally, multiple cores are fully exploited to perform the maximum throughputs on the target devices. The proposed implementations achieved the fastest LEA encryption within 3.2 cycle/byte for Cortex-A9 processors.

Cite

CITATION STYLE

APA

Seo, H., Park, T., Heo, S., Seo, G., Bae, B., Hu, Z., … Kim, H. (2017). Parallel implementations of lea, revisited. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10144 LNCS, pp. 318–330). Springer Verlag. https://doi.org/10.1007/978-3-319-56549-1_27

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free