With the shift towards chip multiprocessors (CMPs), exploiting and
managing parallelism has become a central problem in computer systems.
Many issues of parallelism management boil down to discerning which
running threads or processes are critical, or slowest, versus which
are non-critical. If one can accurately predict critical threads
in a parallel program, then one can respond in a variety of ways.
Possibilities include running the critical thread at a faster clock
rate, applying load-balancing techniques to offload work onto currently
non-critical threads, or giving the critical thread more on-chip
resources to execute faster.
This paper proposes and evaluates simple but effective thread criticality
predictors for parallel applications. We show that accurate predictors
can be built using counters that are typically already available
on-chip. Our predictor, based on memory hierarchy statistics, identifies
thread criticality with an average accuracy of 93% across a range
We also demonstrate two applications of our predictor. First, we show
how Intel's Threading Building Blocks (TBB) parallel runtime system
can benefit from task stealing techniques that use our criticality
predictor to reduce load imbalance. Using criticality prediction
to guide TBB's task-stealing decisions improves performance by
13-32% for TBB-based PARSEC benchmarks running on a 32-core CMP.
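
The idea of criticality-guided task stealing can be illustrated with a minimal sketch. It is not the paper's implementation or TBB's API: the `Worker` class, the miss-penalty weights, and the policy of stealing from the worker with the highest weighted miss count are all assumptions made for illustration, based only on the statement that the predictor uses memory hierarchy statistics.

```python
from collections import deque
from dataclasses import dataclass, field

# Hypothetical miss penalties in cycles; illustrative values only.
L1_MISS_PENALTY = 10
L2_MISS_PENALTY = 100


@dataclass
class Worker:
    """A hypothetical worker thread with per-thread cache-miss counters."""
    wid: int
    l1_misses: int = 0
    l2_misses: int = 0
    tasks: deque = field(default_factory=deque)

    def criticality(self) -> int:
        # Assumed metric: cache misses weighted by their penalty.
        return (self.l1_misses * L1_MISS_PENALTY
                + self.l2_misses * L2_MISS_PENALTY)


def pick_victim(thief, workers):
    """Return the most critical worker that still has queued tasks."""
    candidates = [w for w in workers if w is not thief and w.tasks]
    if not candidates:
        return None
    return max(candidates, key=lambda w: w.criticality())


def steal(thief, workers):
    """Move one task from the predicted critical worker to the thief."""
    victim = pick_victim(thief, workers)
    if victim is None:
        return None
    task = victim.tasks.popleft()  # take the oldest queued task
    thief.tasks.append(task)
    return task
```

An idle worker would call `steal(self, workers)` instead of choosing a victim at random, so that work moves away from the thread predicted to finish last.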
As a second application, criticality prediction guides dynamic energy
optimizations in barrier-based applications. By running the predicted
critical thread at the full clock rate and frequency-scaling non-critical
threads, this approach achieves average energy savings of 15% while
negligibly degrading performance for SPLASH-2 and PARSEC benchmarks.
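
The energy optimization can likewise be sketched under stated assumptions. The function below is hypothetical and not the paper's algorithm: it assumes each thread has a numeric criticality score, that the available frequency levels are given in MHz, and that a thread's frequency may be lowered in proportion to its score relative to the critical thread.

```python
def assign_frequencies(criticality, levels):
    """Pick one frequency level per thread from its criticality score.

    The most critical thread gets the highest level. Every other thread
    gets the lowest level that is at least f_max * (score / top_score),
    an assumed proportional-slack policy.
    """
    if not criticality:
        return []
    levels = sorted(levels)
    f_max = levels[-1]
    top = max(criticality)
    if top == 0:
        # No criticality information yet: keep every thread at full speed.
        return [f_max] * len(criticality)
    chosen = []
    for score in criticality:
        target = f_max * (score / top)
        # Round up to an available level so the thread is not slowed
        # below its proportional target.
        chosen.append(next(f for f in levels if f >= target))
    return chosen
```

With scores `[200, 2050, 1000]` and levels `[800, 1600, 2400]`, the second thread is predicted critical and stays at 2400 MHz, while the others are scaled down to 800 MHz and 1600 MHz.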