Graphics Processing Units (GPUs) can achieve remarkable performance on dataset-oriented applications such as the Back Propagation Network (BPN), given a reasonable task decomposition and memory optimization. However, the advantages of the GPU's memory architecture have not yet been fully exploited for parallelizing BPN. In this paper, we develop and analyze a parallel implementation of a back propagation neural network using CUDA, focusing on kernel optimization through the use of shared memory and suitable block dimensions. The implementation was tested on seven well-known benchmark data sets, and the results show that speedups of 33.8x to 64.3x can be realized compared to a sequential implementation on a CPU.
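The abstract does not reproduce the paper's kernels, but the shared-memory optimization it refers to can be sketched as follows. This is a minimal illustrative CUDA kernel for one fully connected layer's forward pass, y = sigmoid(W·x), in which the input vector is staged in shared memory so every thread in a block reuses it; the names (`TILE`, `forwardLayer`) and the launch assumption `blockDim.x == TILE` are assumptions of this sketch, not details from the paper.

```cuda
// Sketch only: one BPN layer forward pass with the input vector
// tiled through shared memory. Assumes the kernel is launched with
// blockDim.x == TILE, one output neuron per thread.
#define TILE 256

__global__ void forwardLayer(const float *W, const float *x,
                             float *y, int nIn, int nOut)
{
    __shared__ float xs[TILE];                        // tile of the input vector
    int row = blockIdx.x * blockDim.x + threadIdx.x;  // output neuron index
    float acc = 0.0f;

    // Walk the input in TILE-sized chunks, loading each chunk
    // cooperatively into shared memory before all threads use it.
    for (int base = 0; base < nIn; base += TILE) {
        int i = base + threadIdx.x;
        xs[threadIdx.x] = (i < nIn) ? x[i] : 0.0f;
        __syncthreads();                              // tile fully loaded

        if (row < nOut) {
            int lim = min(TILE, nIn - base);
            for (int k = 0; k < lim; ++k)
                acc += W[row * nIn + base + k] * xs[k];
        }
        __syncthreads();                              // safe to overwrite tile
    }

    if (row < nOut)
        y[row] = 1.0f / (1.0f + expf(-acc));          // sigmoid activation
}
```

Choosing the block dimension (here `TILE`) trades shared-memory footprint against occupancy, which is the kind of tuning the paper's kernel analysis addresses.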
Wang, Y., Tang, P., An, H., Liu, Z., Wang, K., & Zhou, Y. (2015). Optimization and analysis of parallel back propagation neural network on GPU using CUDA. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9491, pp. 156–163). Springer Verlag. https://doi.org/10.1007/978-3-319-26555-1_18