Power/performance trade-offs of small batched LU based solvers on GPUs


Abstract

In this paper we propose and analyze a set of batched linear solvers for small matrices on Graphics Processing Units (GPUs), evaluating the alternatives as a function of the size of the systems to solve. We discuss three solutions that operate at different levels of parallelism and exploit different GPU features. The first, built on the CUBLAS library, handles matrices of size up to 32x32 and employs warp-level parallelism (one matrix per warp) and shared memory. The second works at thread-block-level parallelism (one matrix per thread block), still exploiting shared memory, and handles matrices up to 76x76. The third is thread-level parallel (one matrix per thread) and reaches sizes up to 128x128, but it does not exploit shared memory and relies only on the high memory bandwidth of the GPU. The first and second solutions support only partial pivoting; the third easily supports both partial and full pivoting, making it attractive for problems that require greater numerical stability. We analyze the trade-offs in terms of performance and power consumption as a function of the size of the linear systems that are solved simultaneously. We execute the three implementations on a Tesla M2090 (Fermi) and on a Tesla K20 (Kepler). © 2013 Springer-Verlag.
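
To make the CUBLAS-based approach concrete, the following is a minimal sketch, not the authors' code, of solving a batch of small n x n systems with the cuBLAS batched LU routines cublasDgetrfBatched and cublasDgetrsBatched. The matrix size (32), batch count (1024), fill pattern, and the use of the current batched-solve interface are illustrative assumptions; error checking is omitted for brevity.

    #include <cstdio>
    #include <vector>
    #include <cuda_runtime.h>
    #include <cublas_v2.h>

    int main() {
        const int n = 32;          // small system size (warp-level limit cited in the abstract)
        const int batch = 1024;    // number of systems solved simultaneously (illustrative)

        cublasHandle_t handle;
        cublasCreate(&handle);

        // Host-side batch: diagonally dominant column-major matrices, all-ones right-hand sides.
        std::vector<double> hA((size_t)batch * n * n), hB((size_t)batch * n, 1.0);
        for (int k = 0; k < batch; ++k)
            for (int j = 0; j < n; ++j)
                for (int i = 0; i < n; ++i)
                    hA[(size_t)k * n * n + (size_t)j * n + i] = (i == j) ? n : 1.0;

        // Device copies of the matrices/RHS plus the pointer arrays the batched interface expects.
        double *dA, *dB;
        cudaMalloc((void**)&dA, hA.size() * sizeof(double));
        cudaMalloc((void**)&dB, hB.size() * sizeof(double));
        cudaMemcpy(dA, hA.data(), hA.size() * sizeof(double), cudaMemcpyHostToDevice);
        cudaMemcpy(dB, hB.data(), hB.size() * sizeof(double), cudaMemcpyHostToDevice);

        std::vector<double*> hAptr(batch), hBptr(batch);
        for (int k = 0; k < batch; ++k) {
            hAptr[k] = dA + (size_t)k * n * n;
            hBptr[k] = dB + (size_t)k * n;
        }
        double **dAptr, **dBptr;
        cudaMalloc((void**)&dAptr, batch * sizeof(double*));
        cudaMalloc((void**)&dBptr, batch * sizeof(double*));
        cudaMemcpy(dAptr, hAptr.data(), batch * sizeof(double*), cudaMemcpyHostToDevice);
        cudaMemcpy(dBptr, hBptr.data(), batch * sizeof(double*), cudaMemcpyHostToDevice);

        int *dPiv, *dInfo;
        cudaMalloc((void**)&dPiv, (size_t)batch * n * sizeof(int));
        cudaMalloc((void**)&dInfo, batch * sizeof(int));

        // Batched LU factorization with partial pivoting, then the batched triangular solves.
        cublasDgetrfBatched(handle, n, dAptr, n, dPiv, dInfo, batch);
        int hostInfo = 0;
        cublasDgetrsBatched(handle, CUBLAS_OP_N, n, 1, dAptr, n, dPiv, dBptr, n, &hostInfo, batch);

        cudaMemcpy(hB.data(), dB, hB.size() * sizeof(double), cudaMemcpyDeviceToHost);
        // For this test matrix every solution entry equals 1 / (2n - 1).
        printf("x[0] of first system = %f (expected %f)\n", hB[0], 1.0 / (2.0 * n - 1.0));

        cudaFree(dPiv); cudaFree(dInfo); cudaFree(dAptr); cudaFree(dBptr); cudaFree(dA); cudaFree(dB);
        cublasDestroy(handle);
        return 0;
    }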
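
For the thread-level variant, the core idea is that each thread independently factorizes and solves one small system straight from global memory, with no shared memory involved. The kernel below is a hedged sketch of that general technique (Gaussian elimination with partial pivoting on a column-major matrix), not the kernel from the paper; the function name and launch configuration are illustrative.

    // Each thread factorizes and solves one n x n column-major system in place.
    // Launch example: lu_solve_one_per_thread<<<(batch + 127) / 128, 128>>>(dA, dB, n, batch);
    __global__ void lu_solve_one_per_thread(double *A, double *b, int n, int batch)
    {
        int k = blockIdx.x * blockDim.x + threadIdx.x;
        if (k >= batch) return;
        double *a = A + (size_t)k * n * n;   // this thread's matrix
        double *x = b + (size_t)k * n;       // this thread's right-hand side

        // Gaussian elimination with partial pivoting applied to [A | b].
        for (int j = 0; j < n; ++j) {
            int p = j;                                        // pivot row for column j
            for (int i = j + 1; i < n; ++i)
                if (fabs(a[j * n + i]) > fabs(a[j * n + p])) p = i;
            for (int c = 0; c < n; ++c) {                     // swap rows j and p
                double t = a[c * n + j]; a[c * n + j] = a[c * n + p]; a[c * n + p] = t;
            }
            double tb = x[j]; x[j] = x[p]; x[p] = tb;
            for (int i = j + 1; i < n; ++i) {                 // eliminate below the diagonal
                double m = a[j * n + i] / a[j * n + j];
                for (int c = j + 1; c < n; ++c)
                    a[c * n + i] -= m * a[c * n + j];
                x[i] -= m * x[j];
            }
        }
        for (int j = n - 1; j >= 0; --j) {                    // back substitution
            double s = x[j];
            for (int c = j + 1; c < n; ++c) s -= a[c * n + j] * x[c];
            x[j] = s / a[j * n + j];
        }
    }

Because each system lives entirely in global memory, throughput in this style of kernel is governed by occupancy and memory bandwidth rather than by shared-memory capacity, which is consistent with the trade-off described in the abstract.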

Cite (APA)

Villa, O., Fatica, M., Gawande, N., & Tumeo, A. (2013). Power/performance trade-offs of small batched LU based solvers on GPUs. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8097 LNCS, pp. 813–825). https://doi.org/10.1007/978-3-642-40047-6_81
