We introduce a well-optimized implementation of the particle swarm optimization (PSO) algorithm based on the Compute Unified Device Architecture (CUDA), using a global neighborhood topology with extremely large swarms (more than 1000 particles). The optimization relies on effective data organization in GPU memory, together with transfer and thread optimization, pinned memory, and the zero-copy mechanism. Experimental results show that the GPU implementation is significantly faster than the CPU implementation.
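For readers unfamiliar with the global neighborhood topology, the following is a minimal CPU reference sketch in Python, not the authors' CUDA implementation. Every particle is attracted toward its own best position and toward a single swarm-wide best. The function name `gbest_pso`, the sphere test function, the coefficients `w`, `c1`, `c2`, and the search bounds are illustrative assumptions and do not come from the paper. The global best is refreshed once per iteration, after all particles have moved. This mirrors how a one-thread-per-particle GPU kernel followed by a reduction would plausibly be organized.

```python
import random


def sphere(x):
    """Illustrative test function (assumption): sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def gbest_pso(f, dim, n_particles=30, iters=100, lo=-5.0, hi=5.0,
              w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO; coefficient values are common defaults, assumed here."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]              # personal best positions
    pbest_val = [f(p) for p in pos]          # personal best fitness values
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(iters):
        # Per-particle step: on a GPU this loop body would map to one thread.
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the new position to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
        # Global topology: one reduction over the whole swarm per iteration.
        g = min(range(n_particles), key=pbest_val.__getitem__)
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g][:], pbest_val[g]
    return gbest, gbest_val


if __name__ == "__main__":
    best, val = gbest_pso(sphere, dim=2)
    print(best, val)
```

In this sketch the only data shared between particles is `gbest`, which is why the global topology needs just one reduction per iteration regardless of swarm size.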
Kołodziejczyk, J., Sychel, D., & Bera, A. (2017). Improved CUDA PSO based on global topology. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10245 LNAI, pp. 347–358). Springer Verlag. https://doi.org/10.1007/978-3-319-59063-9_31