Implementation and optimization of three-dimensional UPML-FDTD algorithm on GPU clusters

Abstract

Co-processors with powerful floating-point capability have been used for electromagnetic simulations based on the Finite Difference Time Domain (FDTD) method. This work focuses on the implementation and optimization of a parallel 3D UPML-FDTD algorithm on GPU clusters. Several techniques are used to optimize the FDTD algorithm, including the use of GPU texture memory and asynchronous data transfer between CPU and GPU. The performance of the parallel FDTD algorithm is tested on K20m GPU clusters. Its scalability is tested on up to 80 NVIDIA Tesla K20m GPUs, with a parallel efficiency of up to 95%, and the optimization techniques explored in this study are found to improve performance. © 2014 Springer International Publishing.
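For readers unfamiliar with FDTD, the leapfrog field update that such an algorithm parallelizes can be sketched in one dimension. The code below is only an illustration under stated assumptions, not the authors' implementation: it is 1D rather than 3D, uses normalized units, replaces the UPML absorbing boundary with simple perfectly conducting walls, and runs on the CPU in plain Python. The function name `fdtd_1d` and all of its parameters are hypothetical.

```python
import math


def fdtd_1d(n_cells=200, n_steps=300, courant=0.5, src_pos=50):
    """1D Yee leapfrog update in normalized units (illustrative sketch only).

    Assumptions: H is scaled by the free-space impedance, the walls are
    perfect conductors (ez stays 0 at both ends), and no UPML is applied.
    """
    ez = [0.0] * n_cells        # electric field at integer grid points
    hy = [0.0] * (n_cells - 1)  # magnetic field at half-integer grid points

    for step in range(n_steps):
        # Advance H from the spatial difference of E.
        for i in range(n_cells - 1):
            hy[i] += courant * (ez[i + 1] - ez[i])

        # Advance E from the spatial difference of H; the two end
        # points are never updated, which models conducting walls.
        for i in range(1, n_cells - 1):
            ez[i] += courant * (hy[i] - hy[i - 1])

        # Additive (soft) Gaussian pulse source at one interior node.
        ez[src_pos] += math.exp(-((step - 30.0) / 10.0) ** 2)

    return ez, hy


if __name__ == "__main__":
    ez, hy = fdtd_1d()
    print("peak |ez| after the run:", max(abs(v) for v in ez))
```

Each cell update reads only its nearest neighbours, which is why the method maps well onto GPUs: in a 3D multi-GPU setting, every device would update its own subdomain and exchange only the boundary layers with its neighbours at each time step.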

Cite

Xu, L., & Xu, Y. (2014). Implementation and optimization of three-dimensional UPML-FDTD algorithm on GPU clusters. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8488 LNCS, pp. 478–486). Springer Verlag. https://doi.org/10.1007/978-3-319-07518-1_33
