SIMD vectorization of straight line FFT code

Stefan Kral; Franz Franchetti; Juergen Lorenz; Christoph W. Ueberhuber

Journal Article (Open Access)

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2004) 2790 251-260

DOI: 10.1007/978-3-540-45209-6_39

Abstract

This paper presents compiler technology that targets general-purpose microprocessors augmented with SIMD execution units for exploiting data-level parallelism. FFT kernels are accelerated by automatically vectorizing blocks of straight-line code for processors featuring two-way short vector SIMD extensions like AMD's 3DNow! and Intel's SSE2. Additionally, a special compiler backend is introduced which is able to (i) utilize particular code properties, (ii) generate optimized address computation, and (iii) apply specialized register allocation and instruction scheduling. Experiments show that automatic SIMD vectorization can achieve performance that is comparable to the optimal hand-generated code for FFT kernels. The newly developed methods have been integrated into the codelet generator of FFTW and successfully vectorized complicated code like real-to-halfcomplex non-power-of-two FFT kernels. The floating-point performance of FFTW's scalar version has been more than doubled, resulting in the fastest FFT implementation to date. © Springer-Verlag 2003.
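To make the idea of two-way vectorization of straight-line code concrete, the following is a minimal illustrative sketch, not the vectorizer described in the paper. It assumes a hypothetical `V2` class that models one two-way SIMD register in plain Python, and it shows how the scalar operations of a single complex butterfly (y0 = x0 + w·x1, y1 = x0 − w·x1) can be paired into element-wise vector operations plus one swap. The names `V2`, `butterfly_scalar`, and `butterfly_v2` are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class V2:
    """Hypothetical two-way vector: models one SIMD register holding two floats."""
    lo: float
    hi: float

    def __add__(self, other):
        return V2(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        return V2(self.lo - other.lo, self.hi - other.hi)

    def __mul__(self, other):
        # Element-wise product, as a SIMD multiply would compute it.
        return V2(self.lo * other.lo, self.hi * other.hi)

    def swap(self):
        # Exchange the two lanes (a shuffle on real hardware).
        return V2(self.hi, self.lo)


def butterfly_scalar(x0r, x0i, x1r, x1i, wr, wi):
    """Straight-line scalar code: y0 = x0 + w*x1, y1 = x0 - w*x1."""
    tr = wr * x1r - wi * x1i
    ti = wr * x1i + wi * x1r
    return (x0r + tr, x0i + ti, x0r - tr, x0i - ti)


def butterfly_v2(x0, x1, w):
    """Same butterfly with each complex number held as V2(re, im)."""
    a = V2(w.lo, w.lo) * x1           # (wr*x1r, wr*x1i)
    b = V2(w.hi, w.hi) * x1.swap()    # (wi*x1i, wi*x1r)
    t = a + b * V2(-1.0, 1.0)         # (wr*x1r - wi*x1i, wr*x1i + wi*x1r)
    return x0 + t, x0 - t


if __name__ == "__main__":
    # x0 = 1+2i, x1 = 3+4i, w = -i
    y0, y1 = butterfly_v2(V2(1.0, 2.0), V2(3.0, 4.0), V2(0.0, -1.0))
    print(y0, y1)
```

In this sketch the scalar version uses ten floating-point operations, while the paired version uses six element-wise vector operations plus a swap and two broadcasts. Whether such data rearrangement pays off on a given processor is the kind of trade-off an automatic vectorizer has to weigh.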

Cite

Kral, S., Franchetti, F., Lorenz, J., & Ueberhuber, C. W. (2004). SIMD vectorization of straight line FFT code. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2790, 251–260. https://doi.org/10.1007/978-3-540-45209-6_39
