Implementation and optimization of three-dimensional UPML-FDTD algorithm on GPU clusters

Lei Xu; Ying Xu

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2014), Vol. 8488 LNCS, pp. 478–486

DOI: 10.1007/978-3-319-07518-1_33

Abstract

Co-processors with powerful floating-point capability have been used to run electromagnetic simulations based on the Finite Difference Time Domain (FDTD) method. This work focuses on the implementation and optimization of a parallel 3D UPML-FDTD algorithm on GPU clusters. Several techniques are used to optimize the FDTD algorithm, including GPU texture memory and asynchronous data transfer between CPU and GPU. The performance of the parallel FDTD algorithm is tested on a cluster of NVIDIA Tesla K20m GPUs. Scalability is tested on up to 80 GPUs, with parallel efficiency of up to 95%, and the optimization techniques explored in this study are found to improve performance. © 2014 Springer International Publishing.

Citation (APA)

Xu, L., & Xu, Y. (2014). Implementation and optimization of three-dimensional UPML-FDTD algorithm on GPU clusters. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8488 LNCS, pp. 478–486). Springer Verlag. https://doi.org/10.1007/978-3-319-07518-1_33
