A Case Study of LLVM-Based Analysis for Optimizing SIMD Code Generation


Abstract

This paper presents a methodology for using LLVM-based tools to tune the DCA++ (dynamical cluster approximation) application for the new ARM A64FX processor. The goal is to describe the changes required for the new architecture and to generate efficient single instruction/multiple data (SIMD) instructions for the Scalable Vector Extension (SVE) instruction set. During manual tuning, the authors used the LLVM tools to improve code parallelization with OpenMP SIMD, refactored the code and applied transformations that enabled SIMD optimizations, and ensured that the correct libraries were used to achieve optimal performance. These code changes yielded a speedup of 1.98× and 78 GFlops on the A64FX processor. The authors aim to automate parts of this effort in the OpenMP Advisor tool, which is built on top of existing and newly introduced LLVM tooling.
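The OpenMP SIMD approach mentioned in the abstract can be illustrated with a minimal sketch. The kernels below (`axpy` and `dot`) are hypothetical examples chosen for illustration and are not taken from DCA++ or from the paper; they only show how `#pragma omp simd` marks a loop as safe to vectorize and how a reduction clause lets the compiler vectorize an accumulation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical kernel (not from DCA++): y[i] += a * x[i].
// The simd directive asserts that iterations are independent, so the
// compiler may vectorize without proving that x and y do not alias.
void axpy(std::size_t n, double a, const double* x, double* y) {
  #pragma omp simd
  for (std::size_t i = 0; i < n; ++i) {
    y[i] += a * x[i];
  }
}

// Hypothetical kernel (not from DCA++): dot product of two vectors.
// The reduction clause gives each SIMD lane a private partial sum that
// is combined after the loop, which permits vectorizing the accumulation.
double dot(const std::vector<double>& x, const std::vector<double>& y) {
  double sum = 0.0;
  const std::size_t n = x.size() < y.size() ? x.size() : y.size();
  #pragma omp simd reduction(+ : sum)
  for (std::size_t i = 0; i < n; ++i) {
    sum += x[i] * y[i];
  }
  return sum;
}
```

With Clang, a file like this could plausibly be built with `-O3 -fopenmp-simd -mcpu=a64fx`, and `-Rpass=loop-vectorize` together with `-Rpass-missed=loop-vectorize` would report which loops were or were not vectorized. The exact flags and tooling used by the authors are not stated in the abstract, so this invocation is an assumption.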

Citation (APA)

Huber, J., Wei, W., Georgakoudis, G., Doerfert, J., & Hernandez, O. (2021). A Case Study of LLVM-Based Analysis for Optimizing SIMD Code Generation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12870 LNCS, pp. 142–155). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-85262-7_10
