Amortised deep parameter optimisation of GPGPU work group size for OpenCV

Jeongju Sohn; Seongmin Lee; Shin Yoo

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2016), Vol. 9962 LNCS, pp. 211–217

DOI: 10.1007/978-3-319-47106-8_14

Abstract

GPGPU (General Purpose computing on Graphics Processing Units) enables massive parallelism by taking advantage of the Single Instruction Multiple Data (SIMD) architecture of the large number of cores found on modern graphics cards. A parameter called the local work group size controls how many work items are executed concurrently on a single compute unit. Although this parameter is critical to performance, there is no deterministic way to tune it, which leaves developers to rely on manual trial and error. This paper applies amortised optimisation to determine the best local work group size for GPGPU implementations of the OpenCV template matching feature. The empirical evaluation shows that optimised local work group sizes can outperform the default value with large effect sizes.
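
As a rough illustration of the idea in the abstract, the sketch below shows one possible reading of amortised tuning: a hill climber that tries a single neighbouring work group size per real invocation and keeps it only if it ran faster. This is an assumption, not the authors' implementation. The names `AmortisedTuner`, `measure` and `fake_cost`, the candidate sizes, the starting point and the synthetic cost function are all hypothetical; in practice `measure` would launch the GPU kernel with the given local work group size and return its elapsed time.

```python
import math
import random
from typing import Callable, Optional, Sequence, Tuple

Size = Tuple[int, int]  # (width, height) of a 2-D local work group


class AmortisedTuner:
    """Hill climber that spends one fitness evaluation per invocation."""

    def __init__(self, measure: Callable[[Size], float], start: Size,
                 candidates: Sequence[int], seed: int = 0) -> None:
        self.measure = measure              # returns the cost of one run
        self.candidates = list(candidates)  # allowed values per dimension
        self.best = start
        self.best_cost: Optional[float] = None
        self.rng = random.Random(seed)

    def _neighbour(self, size: Size) -> Size:
        # Move one randomly chosen dimension to an adjacent candidate value,
        # clamping at the ends of the candidate list.
        dim = self.rng.randrange(2)
        idx = self.candidates.index(size[dim]) + self.rng.choice((-1, 1))
        idx = min(max(idx, 0), len(self.candidates) - 1)
        moved = list(size)
        moved[dim] = self.candidates[idx]
        return (moved[0], moved[1])

    def step(self) -> float:
        # The first call measures the starting size; later calls try a neighbour.
        trial = self.best if self.best_cost is None else self._neighbour(self.best)
        cost = self.measure(trial)
        if self.best_cost is None or cost < self.best_cost:
            self.best, self.best_cost = trial, cost
        return cost


def fake_cost(size: Size) -> float:
    # Synthetic stand-in for a measured kernel time (assumption for the demo):
    # cheapest at (16, 8), growing with distance on a log2 scale.
    w, h = size
    return 1.0 + abs(math.log2(w) - 4) + abs(math.log2(h) - 3)


tuner = AmortisedTuner(fake_cost, start=(1, 1), candidates=[1, 2, 4, 8, 16, 32, 64])
for _ in range(500):  # each iteration stands for one real invocation
    tuner.step()
print(tuner.best, tuner.best_cost)
```

Because each call evaluates only one candidate, the search cost is spread over normal use of the program instead of being paid in a separate tuning phase. Real timings are noisy, so a practical version would likely average several measurements before accepting a move.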

Cite

Sohn, J., Lee, S., & Yoo, S. (2016). Amortised deep parameter optimisation of GPGPU work group size for OpenCV. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9962 LNCS, pp. 211–217). Springer Verlag. https://doi.org/10.1007/978-3-319-47106-8_14
