An OpenMP implementation of parallel FFT and its performance on IA-64 processors


Abstract

In this paper, we propose an OpenMP implementation of a recursive algorithm for the parallel fast Fourier transform (FFT) on shared-memory parallel computers. A recursive three-step FFT algorithm improves performance by making effective use of the cache memory. Performance results of one-dimensional FFTs on the DELL PowerEdge 7150 and the hp workstation zx6000 are reported. We achieved about 757 MFLOPS on the DELL PowerEdge 7150 (Itanium 800 MHz, 4 CPUs) and about 871 MFLOPS on the hp workstation zx6000 (Itanium2 1 GHz, 2 CPUs) for a 2^24-point FFT. © Springer-Verlag Berlin Heidelberg 2003.

Cite


Takahashi, D., Sato, M., & Boku, T. (2003). An OpenMP implementation of parallel FFT and its performance on IA-64 processors. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2716, 99–108. https://doi.org/10.1007/3-540-45009-2_8
