LEA is a new lightweight and low-power encryption algorithm. It has several features that make it especially suitable for parallel hardware and software implementations: simple ARX (addition, rotation, XOR) operations, an S-box-free architecture, and a 32-bit word size. In this paper we evaluate the performance of LEA on ARM NEON and on GPUs, taking advantage of both these features and the parallel computing platforms and programming models provided by NEON and CUDA. Specifically, we propose novel parallel LEA implementations on representative SIMD and SIMT architectures, namely NEON and CUDA. For CUDA, we first designed a thread-based computation model that exploits functional parallelism by computing several encryptions in one thread. To alleviate the memory transfer delay, we allocate memory so that accesses are coalesced. Second, our block cipher implementation is written in assembly language, which provides an efficient and flexible programming environment. With these optimization techniques, we achieved throughputs of 17.352 GBps without memory transfer and 2.5 GBps with memory transfer (GBps: gigabytes per second). For NEON, we adopted pipelined instructions and SIMD-based execution models, which improved encryption performance by 49.85% compared to previous ARM implementations.
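To make the "simple ARX operations" on 32-bit words concrete, the following is a minimal sketch in portable C of a single ARX round and its inverse on a 128-bit state. It is patterned on the LEA round structure as it is commonly described (rotations by 9, 5, and 3 bits), which is an assumption here and is not stated in the abstract. The function names `rol32`, `ror32`, `arx_round_enc`, and `arx_round_dec` are hypothetical, the key schedule is omitted, and the caller supplies six placeholder round-key words. It is not the NEON or CUDA assembly implementation evaluated in the paper.

```c
#include <stdint.h>

/* Rotate a 32-bit word left or right by r bits (0 < r < 32). */
static inline uint32_t rol32(uint32_t x, unsigned r) { return (x << r) | (x >> (32 - r)); }
static inline uint32_t ror32(uint32_t x, unsigned r) { return (x >> r) | (x << (32 - r)); }

/* One ARX encryption round on four 32-bit words (assumed LEA-like structure).
 * rk holds six round-key words; generating them (the key schedule) is omitted. */
static void arx_round_enc(uint32_t x[4], const uint32_t rk[6])
{
    uint32_t t0 = rol32((x[0] ^ rk[0]) + (x[1] ^ rk[1]), 9);
    uint32_t t1 = ror32((x[1] ^ rk[2]) + (x[2] ^ rk[3]), 5);
    uint32_t t2 = ror32((x[2] ^ rk[4]) + (x[3] ^ rk[5]), 3);
    x[3] = x[0];            /* the old first word moves to the last position */
    x[0] = t0;
    x[1] = t1;
    x[2] = t2;
}

/* Inverse of arx_round_enc: undo the rotation, subtract, then XOR the key. */
static void arx_round_dec(uint32_t x[4], const uint32_t rk[6])
{
    uint32_t p0 = x[3];
    uint32_t p1 = (ror32(x[0], 9) - (p0 ^ rk[0])) ^ rk[1];
    uint32_t p2 = (rol32(x[1], 5) - (p1 ^ rk[2])) ^ rk[3];
    uint32_t p3 = (rol32(x[2], 3) - (p2 ^ rk[4])) ^ rk[5];
    x[0] = p0;
    x[1] = p1;
    x[2] = p2;
    x[3] = p3;
}
```

Every step is an addition modulo 2^32, a fixed rotation, or an XOR, with no table lookups. This is why many independent blocks can be processed side by side in SIMD lanes or GPU threads without data-dependent memory accesses.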
Seo, H., Liu, Z., Park, T., Kim, H., Lee, Y., Choi, J., & Kim, H. (2014). Parallel implementations of LEA. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8565, pp. 256–274). Springer Verlag. https://doi.org/10.1007/978-3-319-12160-4_16