This paper proposes an implementation of a parallel one-dimensional fast Fourier transform (FFT) using Streaming SIMD Extensions 3 (SSE3) instructions on dual-core processors. Combining vectorization with the block six-step FFT algorithm is shown to improve performance effectively. Performance results for one-dimensional FFTs on dual-core Intel Xeon processors are reported. For a 2^20-point FFT, the implementation achieved approximately 2006 MFLOPS on a dual-core Intel Xeon PC (2.8 GHz, two CPUs, four cores) and approximately 3492 MFLOPS on a dual-core Intel Xeon 5150 PC (2.66 GHz, two CPUs, four cores). © Springer-Verlag Berlin Heidelberg 2007.
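For readers unfamiliar with the six-step FFT mentioned above, the following is a minimal sketch of the basic decomposition, assuming the length factors as n = n1 * n2. It is not the paper's implementation: it omits the cache blocking, the SSE3 vectorization, and the parallelization, and it uses a naive DFT for the short transforms to stay brief. The function names `dft` and `six_step_fft` are hypothetical and chosen only for illustration.

```python
import cmath


def dft(v):
    """Naive O(m^2) discrete Fourier transform, used here for the short transforms."""
    m = len(v)
    return [
        sum(v[j] * cmath.exp(-2j * cmath.pi * j * k / m) for j in range(m))
        for k in range(m)
    ]


def six_step_fft(x, n1, n2):
    """Six-step decomposition of a length n1*n2 transform (illustrative sketch).

    Input index:  j = j1 + j2*n1
    Output index: k = k2 + k1*n2
    """
    n = n1 * n2
    if len(x) != n:
        raise ValueError("len(x) must equal n1 * n2")

    # Steps 1-2: transpose (gather stride-n1 elements), then n1 transforms of length n2.
    rows = [dft([x[j1 + j2 * n1] for j2 in range(n2)]) for j1 in range(n1)]

    # Step 3: multiply by the twiddle factors w_n^(j1*k2).
    for j1 in range(n1):
        for k2 in range(n2):
            rows[j1][k2] *= cmath.exp(-2j * cmath.pi * j1 * k2 / n)

    # Steps 4-5: transpose again, then n2 transforms of length n1.
    cols = [dft([rows[j1][k2] for j1 in range(n1)]) for k2 in range(n2)]

    # Step 6: final transpose into natural output order.
    y = [0j] * n
    for k2 in range(n2):
        for k1 in range(n1):
            y[k2 + k1 * n2] = cols[k2][k1]
    return y
```

Calling `six_step_fft(x, 3, 4)` on a 12-element list should agree with `dft(x)` up to rounding error. In an optimized version the short transforms would themselves be FFTs, and the transposes are where blocking for cache reuse would typically be applied.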
CITATION STYLE
Takahashi, D. (2007). An implementation of parallel 1-D FFT using SSE3 instructions on dual-core processors. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4699 LNCS, pp. 1178–1187). Springer Verlag. https://doi.org/10.1007/978-3-540-75755-9_135