Implementing a systolic algorithm for QR factorization on multicore clusters with PaRSEC

This article is free to access.

Abstract

This article introduces a new systolic algorithm for QR factorization and its implementation on a supercomputing cluster of multicore nodes. The algorithm targets a virtual 3D array and requires only local communications. The implementation uses threads at the node level and MPI for internode communication. The complexity of the implementation is managed with the PaRSEC software, which takes as input a parametrized dependence graph derived from the algorithm and only requires the user to decide, at a high level, the allocation of tasks to nodes. We show that the new algorithm performs competitively with state-of-the-art QR routines on the Kraken supercomputer, which indicates that high-level programming environments such as PaRSEC provide a viable way to produce quality software for complex, hierarchical architectures. © 2014 Springer-Verlag Berlin Heidelberg.
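
The abstract does not spell out the algorithm, so the following is only a generic illustration of how a QR factorization can be built from purely local steps. It is a minimal sketch, assuming Givens rotations applied to pairs of adjacent rows, the kind of neighbour-only elimination that classical systolic QR designs rely on. It is not the 3D-array algorithm from the paper and does not use PaRSEC; the function name `givens_qr_adjacent` is hypothetical.

```python
import math


def givens_qr_adjacent(a):
    """Return (q, r) with a = q @ r, using rotations on adjacent rows only.

    Illustrative sketch, not the paper's algorithm: `a` is a list of rows.
    """
    m, n = len(a), len(a[0])
    r = [[float(v) for v in row] for row in a]
    # qt accumulates the applied rotations, so that qt @ a == r throughout.
    qt = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]

    for j in range(min(m - 1, n)):
        # Clear column j from the bottom up; each step touches rows i-1 and i.
        for i in range(m - 1, j, -1):
            x, y = r[i - 1][j], r[i][j]
            h = math.hypot(x, y)
            if h == 0.0:
                continue  # both entries are already zero
            c, s = x / h, y / h
            for mat in (r, qt):
                top, bot = mat[i - 1], mat[i]
                mat[i - 1] = [c * t + s * b for t, b in zip(top, bot)]
                mat[i] = [-s * t + c * b for t, b in zip(top, bot)]
            r[i][j] = 0.0  # remove rounding residue in the eliminated entry

    # Q is the transpose of the accumulated rotations.
    q = [list(col) for col in zip(*qt)]
    return q, r
```

Calling `givens_qr_adjacent([[12, -51, 4], [6, 167, -68], [-4, 24, -41]])` returns an orthogonal `q` and an upper triangular `r` whose product reproduces the input up to rounding. Because every rotation involves only two neighbouring rows, each step depends on data held by adjacent positions, which is the property the abstract refers to as local communication.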

Cite

Aupy, G., Faverge, M., Robert, Y., Kurzak, J., Luszczek, P., & Dongarra, J. (2014). Implementing a systolic algorithm for QR factorization on multicore clusters with PaRSEC. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8374 LNCS, pp. 657–667). Springer Verlag. https://doi.org/10.1007/978-3-642-54420-0_64
