An implementation of parallel 1-D FFT using SSE3 instructions on dual-core processors


Abstract

In the present paper, an implementation of a parallel one-dimensional fast Fourier transform (FFT) using Streaming SIMD Extensions 3 (SSE3) instructions on dual-core processors is proposed. Combining vectorization with the block six-step FFT algorithm is shown to improve performance effectively. Performance results for one-dimensional FFTs on dual-core Intel Xeon processors are reported. We achieved approximately 2006 MFLOPS on a dual-core Intel Xeon PC (2.8 GHz, two CPUs, four cores) and approximately 3492 MFLOPS on a dual-core Intel Xeon 5150 PC (2.66 GHz, two CPUs, four cores) for a 2^20-point FFT. © Springer-Verlag Berlin Heidelberg 2007.
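
For readers unfamiliar with the six-step FFT named in the abstract, the following is a minimal sketch of the plain (unblocked, unvectorized) six-step decomposition in pure Python. It is not the paper's implementation: the cache blocking, SSE3 vectorization, and multi-core parallelism the paper describes are omitted, and the function names (`dft`, `transpose`, `six_step_fft`) and the factor parameters `n1` and `n2` are hypothetical, chosen only for illustration. A naive O(n^2) DFT stands in for the short row FFTs to keep the example self-contained.

```python
import cmath


def dft(row):
    """Naive O(n^2) DFT, used here as a stand-in for the short row FFTs."""
    n = len(row)
    return [
        sum(row[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def transpose(matrix):
    """Return the transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]


def six_step_fft(x, n1, n2):
    """Length n1*n2 DFT of x via the six-step decomposition (illustrative only)."""
    n = n1 * n2
    assert len(x) == n, "input length must equal n1 * n2"

    # View x as n2 rows of length n1, so that a[j2][j1] == x[j1 + j2 * n1].
    a = [list(x[r * n1:(r + 1) * n1]) for r in range(n2)]

    # Step 1: transpose to n1 rows of length n2.
    a = transpose(a)
    # Step 2: n1 independent transforms of length n2 (one per row).
    a = [dft(row) for row in a]
    # Step 3: multiply entry (j1, k2) by the twiddle factor exp(-2*pi*i*j1*k2/n).
    a = [
        [a[j1][k2] * cmath.exp(-2j * cmath.pi * j1 * k2 / n) for k2 in range(n2)]
        for j1 in range(n1)
    ]
    # Step 4: transpose to n2 rows of length n1.
    a = transpose(a)
    # Step 5: n2 independent transforms of length n1 (one per row).
    a = [dft(row) for row in a]
    # Step 6: transpose back, then flatten so that y[k2 + k1 * n2] == a[k1][k2].
    a = transpose(a)
    return [value for row in a for value in row]
```

Calling `six_step_fft(x, 4, 8)` on a 32-element list should agree with `dft(x)` up to floating-point rounding. The row transforms in steps 2 and 5 are independent of one another, which is what makes this decomposition a natural fit for splitting work across cores; the three transposes are the memory-bound steps that a blocked variant would aim to make cache-friendly.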

Cite

Takahashi, D. (2007). An implementation of parallel 1-D FFT using SSE3 instructions on dual-core processors. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4699 LNCS, pp. 1178–1187). Springer Verlag. https://doi.org/10.1007/978-3-540-75755-9_135
