Identifying and (automatically) remedying performance problems in CPU/GPU applications

Benjamin Welton; Barton P. Miller

Conference Proceedings, Open Access

Proceedings of the International Conference on Supercomputing (2020)

DOI: 10.1145/3392717.3392759

Abstract

GPU accelerators have become common on today's leadership-class computing platforms. Effective exploitation of the additional parallelism offered by GPUs is fraught with challenges. A key performance challenge faced by developers is how to limit the time consumed by synchronizations between the CPU and GPU. We introduce the extended feed-forward measurement (FFM) performance tool, which automatically detects synchronization problems, identifies whether a synchronization problem is a component of a larger construct that exhibits a problem beyond an individual synchronization operation, identifies remedies that can correct the issue, and in some cases automatically applies remedies to problems exhibited by larger constructs. The extended FFM performance tool identifies three causes of unnecessary synchronizations: a problem caused by a single operation, a problem caused by memory management issues, and a problem caused by a memory transfer. The extended FFM model prescribes remedies for each construct and can automatically apply remedies for problems caused by memory management and memory transfers. We created an implementation of the extended FFM performance tool and employed it to identify and automatically correct problems in three real-world scientific applications, resulting in an automatically obtained reduction in execution time between 9% and 43%.

Citation (APA)

Welton, B., & Miller, B. P. (2020). Identifying and (automatically) remedying performance problems in CPU/GPU applications. In Proceedings of the International Conference on Supercomputing. Association for Computing Machinery. https://doi.org/10.1145/3392717.3392759
