Optimization of an OpenCL-based multi-swarm PSO algorithm on an APU

Abstract

The multi-swarm particle swarm optimization (MPSO) algorithm incorporates multiple independent PSO swarms that cooperate by periodically exchanging information. Despite its embarrassingly parallel nature, MPSO is memory bound, which limits its performance on data-parallel GPUs. Recently, heterogeneous multi-core architectures such as the AMD Accelerated Processing Unit (APU) have fused the CPU and GPU together on a single die, eliminating the traditional PCIe bottleneck between them. In this paper, we report our experience developing an OpenCL-based MPSO algorithm for the task scheduling problem on the APU architecture. We use the AMD A8-3530MX APU, which packs four x86 computing cores and 80 four-way processing elements. We make effective use of hardware features such as the hierarchical memory structure on the APU, the four-way very long instruction word (VLIW) feature for vectorization, and global-to-local memory DMA transfers. We observe a 29% decrease in overall execution time over our baseline implementation. © 2014 Springer-Verlag.
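
To make the idea of independent swarms that periodically exchange information concrete, the following is a minimal sequential sketch in Python. It is not the paper's OpenCL implementation. The abstract does not specify the objective, the exchange topology, or any parameter values, so the following are illustrative assumptions only: the toy `sphere` objective (the paper targets task scheduling), the ring-style exchange of swarm bests, the name `mpso`, and every default such as `exchange_every`, `w`, `c1`, and `c2`.

```python
import random


def sphere(x):
    """Toy continuous objective (assumption; the paper optimizes task scheduling)."""
    return sum(v * v for v in x)


def mpso(objective, dim=4, n_swarms=4, swarm_size=8, iters=200,
         exchange_every=20, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Hypothetical multi-swarm PSO: independent swarms plus periodic exchange."""
    rng = random.Random(seed)

    # Initialise each swarm independently.
    swarms = []
    for _ in range(n_swarms):
        pos = [[rng.uniform(-5.0, 5.0) for _ in range(dim)]
               for _ in range(swarm_size)]
        vel = [[0.0] * dim for _ in range(swarm_size)]
        pbest = [p[:] for p in pos]
        pbest_f = [objective(p) for p in pos]
        g = min(range(swarm_size), key=pbest_f.__getitem__)
        swarms.append({"pos": pos, "vel": vel,
                       "pbest": pbest, "pbest_f": pbest_f,
                       "gbest": pbest[g][:], "gbest_f": pbest_f[g]})

    for it in range(1, iters + 1):
        # Each swarm runs a standard PSO step using only its own best.
        for s in swarms:
            for i in range(swarm_size):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    s["vel"][i][d] = (
                        w * s["vel"][i][d]
                        + c1 * r1 * (s["pbest"][i][d] - s["pos"][i][d])
                        + c2 * r2 * (s["gbest"][d] - s["pos"][i][d]))
                    s["pos"][i][d] += s["vel"][i][d]
                f = objective(s["pos"][i])
                if f < s["pbest_f"][i]:
                    s["pbest"][i], s["pbest_f"][i] = s["pos"][i][:], f
                    if f < s["gbest_f"]:
                        s["gbest"], s["gbest_f"] = s["pos"][i][:], f

        # Periodic exchange (assumed ring topology): each swarm adopts its
        # left neighbour's best position if that position is better.
        if it % exchange_every == 0:
            snapshot = [(s["gbest"][:], s["gbest_f"]) for s in swarms]
            for k, s in enumerate(swarms):
                nb, nb_f = snapshot[k - 1]
                if nb_f < s["gbest_f"]:
                    s["gbest"], s["gbest_f"] = nb[:], nb_f

    best = min(swarms, key=lambda s: s["gbest_f"])
    return best["gbest"], best["gbest_f"]
```

Calling `mpso(sphere)` returns the best position found and its objective value. Between exchanges the swarms share no state, which is the property that makes the algorithm a natural fit for one work-group per swarm on a GPU. The per-particle reads and writes of positions, velocities, and bests in the inner loop hint at why memory traffic, rather than arithmetic, tends to dominate.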

Cite

Franz, W., Thulasiraman, P., & Thulasiram, R. K. (2014). Optimization of an OpenCL-based multi-swarm PSO algorithm on an APU. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8385 LNCS, pp. 140–150). Springer Verlag. https://doi.org/10.1007/978-3-642-55195-6_13
