Patterns of inefficient performance behavior in GPU applications

Abstract

Writing efficient software for heterogeneous architectures equipped with modern accelerator devices presents a serious challenge to programmer productivity, creating a need for powerful performance-analysis tools to support the software development process. To guide the design of such tools, we describe typical patterns of inefficient runtime behavior that may adversely affect the performance of applications that use general-purpose processors together with GPU devices through a CUDA compute engine. To evaluate the general impact of these patterns on application performance, we further present a microbenchmark suite that quantifies the performance penalty of each pattern. Results obtained on NVIDIA Fermi and Tesla architectures demonstrate significant delays. Furthermore, this suite can be used as a default test scenario when adding CUDA support to performance-analysis tools used in high-performance computing. © 2011 IEEE.

Cite

Eschweiler, D., Becker, D., & Wolf, F. (2011). Patterns of inefficient performance behavior in GPU applications. In Proceedings - 19th International Euromicro Conference on Parallel, Distributed, and Network-Based Processing, PDP 2011 (pp. 262–266). https://doi.org/10.1109/PDP.2011.84
