GPU Fast Convolution via the Overlap-and-Save Method in Shared Memory

Abstract

We present a GPU-tailored implementation of the overlap-and-save method, which convolves very long signals with short response functions. We have implemented several FFT algorithms in the CUDA programming language that exploit GPU shared memory, allowing for GPU-accelerated convolution. We compare our implementation with an implementation of the overlap-and-save algorithm that uses the NVIDIA FFT library (cuFFT). We demonstrate that, by using a shared-memory-based FFT, we can achieve significant speed-ups for certain problem sizes and lower the memory requirements of the overlap-and-save method on GPUs.
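
For readers unfamiliar with the method named in the abstract, the following is a minimal CPU sketch of overlap-and-save in plain Python. It is not the authors' shared-memory CUDA implementation. The function names (`fft`, `overlap_save`, `direct_convolve`) and the default segment length `n_fft=8` are assumptions chosen for illustration. The long signal is cut into overlapping segments, each segment is convolved with the short filter through an FFT, and the first `len(h) - 1` samples of every segment, which are corrupted by circular wrap-around, are discarded.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def overlap_save(signal, h, n_fft=8):
    """Linear convolution of a long signal with a short filter h, segment by segment."""
    m = len(h)
    step = n_fft - m + 1  # number of valid output samples per segment
    assert n_fft & (n_fft - 1) == 0 and step > 0
    # The filter spectrum is computed once and reused for every segment.
    H = fft(list(h) + [0.0] * (n_fft - m))
    out_len = len(signal) + m - 1
    # m - 1 leading zeros act as history for the first segment;
    # trailing zeros let the last segments flush the tail of the result.
    padded = [0.0] * (m - 1) + list(signal) + [0.0] * n_fft
    out = []
    pos = 0
    while len(out) < out_len:
        seg = padded[pos:pos + n_fft]
        seg += [0.0] * (n_fft - len(seg))
        S = fft(seg)
        y = fft([a * b for a, b in zip(S, H)], inverse=True)
        # Discard the first m - 1 samples (circular wrap-around), keep the rest.
        out.extend(v.real / n_fft for v in y[m - 1:])
        pos += step  # consecutive segments overlap by m - 1 samples
    return out[:out_len]


def direct_convolve(signal, h):
    """Reference O(N*M) linear convolution used to check the result."""
    out = [0.0] * (len(signal) + len(h) - 1)
    for i, s in enumerate(signal):
        for j, c in enumerate(h):
            out[i + j] += s * c
    return out
```

Because every segment is processed independently, the per-segment FFTs map naturally onto parallel hardware; in this sketch they simply run one after another in a loop.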

Adámek, K., Dimoudi, S., Giles, M., & Armour, W. (2020). GPU Fast Convolution via the Overlap-and-Save Method in Shared Memory. ACM Transactions on Architecture and Code Optimization, 17(3). https://doi.org/10.1145/3394116