Spherical harmonic transform with GPUs



Abstract

We describe an algorithm for computing an inverse spherical harmonic transform suitable for graphics processing units (GPUs). We use CUDA and base our implementation on a Fortran90 routine included in a publicly available parallel package, s2hat. We focus on the two major sequential steps involved in computing the transform, retaining the efficient parallel framework of the original code. We detail the optimization techniques used to enhance the performance of the CUDA-based code and contrast them with those implemented in the Fortran90 version. We present performance comparisons of a single CPU plus GPU unit with the s2hat code running on either a single processor or 4 processors. In particular, we find that the latest generation of GPUs, such as the NVIDIA GF100 (Fermi), can accelerate the spherical harmonic transforms by as much as 18 times with respect to s2hat executed on one core, and by as much as 5.5 times with respect to s2hat on 4 cores, with the overall performance being limited by the fast Fourier transforms. The work presented here was performed in the context of Cosmic Microwave Background simulations and analysis. However, we expect that the developed software will be of more general interest and applicability. © 2012 Springer-Verlag Berlin Heidelberg.
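
For orientation, the two sequential steps can be sketched on a CPU in a few lines of NumPy. The abstract does not spell the steps out, so this sketch assumes they are the textbook ones for iso-latitude grids: a recurrence over normalized associated Legendre functions for each ring, followed by an FFT along each ring. The function name `inverse_sht`, the `alm[l, m]` array layout, and the equally spaced longitudes are assumptions made for illustration; this is not the s2hat or CUDA code described in the paper.

```python
import numpy as np

def inverse_sht(alm, thetas, nphi):
    """Synthesize a real field on rings at colatitudes `thetas`.

    alm[l, m] holds the coefficients for 0 <= m <= l <= lmax (hypothetical layout).
    Returns an array of shape (len(thetas), nphi) sampled at phi_k = 2*pi*k/nphi.
    """
    thetas = np.asarray(thetas, dtype=float)
    lmax = alm.shape[0] - 1
    assert nphi > 2 * lmax, "each ring needs more than 2*lmax samples"
    x, s = np.cos(thetas), np.sin(thetas)
    delta = np.zeros((len(thetas), lmax + 1), dtype=complex)

    # Step 1: for every m, run the recurrence in l and accumulate
    # delta_m(theta) = sum_l alm[l, m] * lambda_lm(theta).
    lam_mm = np.full_like(x, np.sqrt(1.0 / (4.0 * np.pi)))  # lambda_00
    for m in range(lmax + 1):
        if m > 0:
            # Diagonal step: lambda_mm from lambda_(m-1)(m-1).
            lam_mm = -np.sqrt((2 * m + 1) / (2 * m)) * s * lam_mm
        prev2, prev1 = np.zeros_like(x), lam_mm
        delta[:, m] += alm[m, m] * prev1
        for l in range(m + 1, lmax + 1):
            a = np.sqrt((4 * l * l - 1) / (l * l - m * m))
            b = np.sqrt(((l - 1) ** 2 - m * m) / (4 * (l - 1) ** 2 - 1))
            cur = a * (x * prev1 - b * prev2)
            delta[:, m] += alm[l, m] * cur
            prev2, prev1 = prev1, cur

    # Step 2: one inverse real FFT per ring turns delta_m into samples in phi.
    # irfft divides by nphi, so multiply it back.
    return nphi * np.fft.irfft(delta, n=nphi, axis=1)
```

In this sketch the first step is independent across rings and across values of m, which is the kind of structure that maps naturally onto GPU threads, while the second step is a batch of one-dimensional FFTs.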

Cite

Hupca, I. O., Falcou, J., Grigori, L., & Stompor, R. (2012). Spherical harmonic transform with GPUs. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7155 LNCS, pp. 355–366). Springer Verlag. https://doi.org/10.1007/978-3-642-29737-3_40
