Vector-parallel algorithms for 1-dimensional fast Fourier transform

Abstract

We review 1-dimensional FFT algorithms for distributed-memory machines with vector processing nodes. To attain high performance on this type of machine, one has to achieve high single-processor performance and high parallel efficiency at the same time. We explain a general framework for designing 1-D FFT algorithms, based on a 3-dimensional representation of the data, that can satisfy both of these requirements. Among the many algorithms derived from this framework, two variants are shown to be optimal from the viewpoint of both parallel performance and usability. We also introduce several ideas that further improve performance and the flexibility of the user interface. Numerical experiments on the Hitachi SR2201, a distributed-memory parallel machine with pseudo-vector processing nodes, show that our program can attain 48% of the peak performance when computing the FFT of 2^26 points using 64 nodes. © 2005 Springer Science+Business Media, Inc.
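
To make the idea of a 3-dimensional representation concrete, here is a minimal serial sketch in Python. It is not the authors' algorithm: the abstract does not give their index mapping, data distribution across nodes, or vectorization strategy, so the factorization N = n1*n2*n3, the index conventions, and the function names `dft` and `fft_1d_via_3d` are all assumptions chosen for illustration. It uses the standard textbook decomposition, in which a length-N transform becomes three sets of short transforms (one per axis of the 3-D array) separated by twiddle-factor multiplications.

```python
import cmath


def dft(seq):
    """Naive O(m^2) DFT; a stand-in for an optimized short-FFT kernel."""
    m = len(seq)
    return [
        sum(seq[j] * cmath.exp(-2j * cmath.pi * j * k / m) for j in range(m))
        for k in range(m)
    ]


def fft_1d_via_3d(x, n1, n2, n3):
    """Length-N DFT (N = n1*n2*n3) computed through a 3-D view of the data.

    Assumed (hypothetical) index mapping:
      input  index j = j1 + n1*j2 + n1*n2*j3
      output index k = k3 + n3*k2 + n2*n3*k1
    """
    n = n1 * n2 * n3
    if len(x) != n:
        raise ValueError("len(x) must equal n1 * n2 * n3")

    def w(num, den):
        # Twiddle factor exp(-2*pi*i*num/den).
        return cmath.exp(-2j * cmath.pi * num / den)

    # Step 1: transforms of length n3 along j3, then the first twiddle.
    b = {}
    for j1 in range(n1):
        for j2 in range(n2):
            col = dft([x[j1 + n1 * j2 + n1 * n2 * j3] for j3 in range(n3)])
            for k3 in range(n3):
                b[j1, j2, k3] = col[k3] * w(j2 * k3, n2 * n3)

    # Step 2: transforms of length n2 along j2, then the second twiddle.
    c = {}
    for j1 in range(n1):
        for k3 in range(n3):
            col = dft([b[j1, j2, k3] for j2 in range(n2)])
            for k2 in range(n2):
                c[j1, k2, k3] = col[k2] * w(j1 * (k3 + n3 * k2), n)

    # Step 3: transforms of length n1 along j1, written to the output order.
    out = [0j] * n
    for k2 in range(n2):
        for k3 in range(n3):
            col = dft([c[j1, k2, k3] for j1 in range(n1)])
            for k1 in range(n1):
                out[k3 + n3 * k2 + n2 * n3 * k1] = col[k1]
    return out
```

In each step the transforms along one axis are independent of one another, which is the general property that lets decompositions of this kind expose both long vectorizable loops and work that can be divided among nodes. How the paper assigns axes to nodes and to vector loops is not stated in the abstract.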

Cite

Yamamoto, Y., Kawamura, H., & Igai, M. (2005). Vector-parallel algorithms for 1-dimensional fast Fourier transform. In New Horizons of Parallel and Distributed Computing (pp. 53–66). Springer US. https://doi.org/10.1007/0-387-28967-4_4