In this paper, we propose an OpenMP implementation of a recursive algorithm for the parallel fast Fourier transform (FFT) on shared-memory parallel computers. The recursive three-step FFT algorithm improves performance by using the cache memory effectively. We report performance results for one-dimensional FFTs on the DELL PowerEdge 7150 and the hp workstation zx6000. For a 2^24-point FFT, we achieved about 757 MFLOPS on the DELL PowerEdge 7150 (Itanium 800 MHz, 4 CPUs) and about 871 MFLOPS on the hp workstation zx6000 (Itanium2 1 GHz, 2 CPUs). © Springer-Verlag Berlin Heidelberg 2003.
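The abstract does not spell out the three-step decomposition, so the following is only a minimal sketch of one common formulation, not the authors' code. It assumes n = n1 * n2 and performs n1 independent n2-point FFTs, a twiddle-factor multiplication with a transpose, and then n2 independent n1-point FFTs, recursing until the sub-problem is small. The names `three_step_fft`, `split`, and `dft`, and the `base` threshold (a stand-in for "fits in cache") are hypothetical choices made for illustration. It is written in plain Python rather than OpenMP.

```python
import cmath


def dft(x):
    """Direct O(n^2) DFT; serves as the recursion base case and as a reference."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def split(n):
    """Return (n1, n2) with n1 * n2 == n and n1 the largest divisor <= sqrt(n)."""
    n1 = max(1, int(n ** 0.5))
    while n % n1:
        n1 -= 1
    return n1, n // n1


def three_step_fft(x, base=4):
    """Recursive three-step FFT sketch (assumed formulation, see lead-in).

    Input index:  j = j1 + n1 * j2
    Output index: k = k2 + n2 * k1
    """
    n = len(x)
    n1, n2 = split(n)
    if n <= base or n1 == 1:  # small or prime length: fall back to direct DFT
        return dft(x)

    # Step 1: n1 independent n2-point FFTs, one per residue j1.
    cols = [three_step_fft(x[j1::n1], base) for j1 in range(n1)]

    # Step 2: multiply by twiddle factors w_n^(j1*k2) and transpose,
    # so that rows[k2][j1] is contiguous for the next step.
    rows = [
        [cols[j1][k2] * cmath.exp(-2j * cmath.pi * j1 * k2 / n) for j1 in range(n1)]
        for k2 in range(n2)
    ]

    # Step 3: n2 independent n1-point FFTs, one per k2.
    out = [0j] * n
    for k2 in range(n2):
        y = three_step_fft(rows[k2], base)
        for k1 in range(n1):
            out[k2 + n2 * k1] = y[k1]
    return out
```

In this sketch the iterations of the step 1 loop over `j1` and of the step 3 loop over `k2` do not depend on one another. In a shared-memory implementation, loops of this kind are natural candidates for an OpenMP work-sharing loop, while the recursion keeps each sub-FFT small enough to stay in cache.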
Takahashi, D., Sato, M., & Boku, T. (2003). An OpenMP implementation of parallel FFT and its performance on IA-64 processors. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2716, 99–108. https://doi.org/10.1007/3-540-45009-2_8