Performance evaluation of concurrent collections on high-performance multicore computing systems


Abstract

This paper is the first extensive performance study of a recently proposed parallel programming model, called Concurrent Collections (CnC). In CnC, the programmer expresses her computation in terms of application-specific operations, partially ordered by semantic scheduling constraints. The CnC model is well-suited to expressing asynchronous-parallel algorithms, so we evaluate CnC using two dense linear algebra algorithms in this style for execution on state-of-the-art multicore systems: (i) a recently proposed asynchronous-parallel Cholesky factorization algorithm, and (ii) a novel and non-trivial "higher-level" partly-asynchronous generalized eigensolver for dense symmetric matrices. Given a well-tuned sequential BLAS, our implementations match or exceed competing multithreaded vendor-tuned codes by up to 2.6x. Our evaluation compares with alternative models, including ScaLAPACK with a shared memory MPI, OpenMP, Cilk++, and PLASMA 2.0, on Intel Harpertown, Nehalem, and AMD Barcelona systems. Looking forward, we identify new opportunities to improve the CnC language and run-time scheduling and execution. © 2010 IEEE.
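As a rough illustration of what "operations partially ordered by scheduling constraints" can mean for Cholesky factorization, here is a minimal sketch in plain Python. It is not the CnC language or runtime, and it is not the paper's implementation. The step names (`potrf`, `trsm`, `update`), the helper functions, and the use of 1x1 scalar "tiles" are assumptions chosen to keep the example short. Each step lists only the steps whose results it needs, and a toy scheduler runs any step whose inputs are ready.

```python
import math

def cholesky_steps(n):
    """Map each step to the steps it depends on (hypothetical naming)."""
    steps = {}
    for k in range(n):
        # Factor diagonal entry k once its last update has been applied.
        steps[("potrf", k)] = [("update", k - 1, k, k)] if k > 0 else []
        for i in range(k + 1, n):
            # Scale entry (i, k) by the factored diagonal entry.
            deps = [("potrf", k)]
            if k > 0:
                deps.append(("update", k - 1, i, k))
            steps[("trsm", k, i)] = deps
        for i in range(k + 1, n):
            for j in range(k + 1, i + 1):
                # Trailing update of entry (i, j) using column k.
                deps = [("trsm", k, i), ("trsm", k, j)]
                if k > 0:
                    deps.append(("update", k - 1, i, j))
                steps[("update", k, i, j)] = deps
    return steps

def execute(step, l):
    """Apply one step in place to the working matrix l."""
    if step[0] == "potrf":
        k = step[1]
        l[k][k] = math.sqrt(l[k][k])
    elif step[0] == "trsm":
        _, k, i = step
        l[i][k] = l[i][k] / l[k][k]
    else:
        _, k, i, j = step
        l[i][j] = l[i][j] - l[i][k] * l[j][k]

def cholesky_dataflow(a):
    """Return lower-triangular L with L * L^T == a (a symmetric positive definite)."""
    n = len(a)
    l = [row[:] for row in a]
    pending = cholesky_steps(n)
    done = set()
    while pending:
        ready = [s for s, deps in pending.items() if all(d in done for d in deps)]
        # Ready steps are independent of each other, so any order is valid;
        # reversing the list shows that no program order is assumed.
        for s in reversed(ready):
            execute(s, l)
            done.add(s)
            del pending[s]
    for i in range(n):
        for j in range(i + 1, n):
            l[i][j] = 0.0  # discard the untouched upper triangle
    return l

a = [[4, 12, -16], [12, 37, -43], [-16, -43, 98]]
print(cholesky_dataflow(a))
```

In a real asynchronous-parallel implementation the ready steps would be dispatched to worker threads and would operate on matrix blocks via sequential BLAS calls; the sequential loop here only shows that the dependence lists alone are enough to determine a valid execution order.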

Cite

Chandramowlishwaran, A., Knobe, K., & Vuduc, R. (2010). Performance evaluation of concurrent collections on high-performance multicore computing systems. In Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, IPDPS 2010. https://doi.org/10.1109/IPDPS.2010.5470404
