Abstract
Parallel programs incur overhead in many different ways, such as synchronization, load imbalance, communication, and insufficient parallelism. We have found that all of these categories are important in understanding the performance of parallel programs, and that a rapid assessment of how processing time is spent in each of these categories is extremely helpful in the performance tuning of parallel programs. As a result we have developed the notion of performance predicates, which are expressions that define these categories and can be used to recognize and classify inefficient states during a program's execution. Formal definition allows us to discuss the categories quantitatively; we present a method for measuring time spent in each category, based on the common metric of lost cycles. The method we describe, called predicate profiling, is shown to be quite useful for both application-level and program-level performance tuning. We show that predicate profiling is relatively easy to implement, and has very low run-time cost. We also show that the lost cycles metric is applicable to programs for which other metrics, like speedup, aren't well defined.
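To make the idea of classifying inefficient states concrete, the following is a minimal sketch and not the authors' implementation. It assumes a program's execution has been sampled into per-processor state records, and that each overhead category can be written as a boolean predicate over such a record. The field names (`waiting_on_lock`, `waiting_on_remote_data`, `idle`, `work_remaining`), the predicate definitions, and the helpers `make_sample`, `classify`, and `lost_cycles` are all hypothetical, chosen only for illustration.

```python
from collections import Counter


def make_sample(**flags):
    """Build one hypothetical per-processor state record (all flags default to False)."""
    sample = {
        "waiting_on_lock": False,
        "waiting_on_remote_data": False,
        "idle": False,
        "work_remaining": False,
    }
    sample.update(flags)
    return sample


# Assumed predicate definitions, checked in order; the first match wins.
# These are illustrative guesses at what each category could mean.
PREDICATES = [
    ("synchronization", lambda s: s["waiting_on_lock"]),
    ("communication", lambda s: s["waiting_on_remote_data"]),
    ("load_imbalance", lambda s: s["idle"] and s["work_remaining"]),
    ("insufficient_parallelism", lambda s: s["idle"] and not s["work_remaining"]),
]


def classify(sample):
    """Return the name of the first overhead category whose predicate holds, else None."""
    for name, predicate in PREDICATES:
        if predicate(sample):
            return name
    return None  # no predicate holds: the cycle counts as useful work


def lost_cycles(samples, cycles_per_sample=1):
    """Sum the cycles attributed to each overhead category across all samples."""
    totals = Counter()
    for sample in samples:
        category = classify(sample)
        if category is not None:
            totals[category] += cycles_per_sample
    return dict(totals)
```

Calling `lost_cycles` on a list of records built with `make_sample` returns a dictionary mapping each observed category to its cycle count. Records that satisfy no predicate are treated as useful work and do not appear in the result.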
Crovella, M. E., & LeBlanc, T. J. (1993). Performance debugging using parallel performance predicates. In Proceedings of the 1993 ACM/ONR Workshop on Parallel and Distributed Debugging, PADD 1993 (pp. 140–150). Association for Computing Machinery, Inc. https://doi.org/10.1145/174266.171276