A high throughput FPGA-Based floating point conjugate gradient implementation

Antonio Roldao Lopes; George A. Constantinides

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2008) 4943 LNCS 75-86

DOI: 10.1007/978-3-540-78610-8_10

22 Citations

22 Readers

Abstract

As Field Programmable Gate Arrays (FPGAs) have reached capacities beyond millions of equivalent gates, it becomes possible to accelerate floating-point scientific computing applications. One type of calculation that is commonplace in scientific computation is the solution of systems of linear equations. A method that has proven in software to be very efficient and robust for finding such solutions is the Conjugate Gradient algorithm. In this paper we present a parallel hardware Conjugate Gradient implementation. The implementation is particularly suited for accelerating multiple small to medium sized dense systems of linear equations. Through parallelization it is possible to convert the computation time per iteration for an order n matrix from Θ(n²) cycles for a software implementation to Θ(n). I/O requirements are scalable and converge to a constant value with the increase of matrix order. Results on a VirtexII-6000 demonstrate sustained performance of 5 GFLOPS and projected results on a Virtex5-330 indicate sustained performance of 35 GFLOPS. The former result is comparable to high-end CPUs, whereas the latter represents a significant speedup. © 2008 Springer-Verlag Berlin Heidelberg.
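
For readers unfamiliar with the method, the following is a minimal software sketch of the textbook Conjugate Gradient iteration for a dense symmetric positive-definite system. It is not the authors' hardware design; the function names (`matvec`, `dot`, `conjugate_gradient`) and the tolerance are assumptions chosen for illustration. It may help to see where the Θ(n²) per-iteration cost mentioned in the abstract comes from: each iteration performs one dense matrix-vector product, while every other step is Θ(n).

```python
def matvec(A, x):
    # Dense matrix-vector product: n rows of n multiply-adds, i.e. Theta(n^2).
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    # Inner product: Theta(n).
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    """Solve A x = b for symmetric positive-definite A (illustrative sketch)."""
    n = len(b)
    if max_iter is None:
        max_iter = 10 * n
    x = [0.0] * n
    r = list(b)          # residual b - A x, with the initial guess x = 0
    p = list(r)          # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break        # residual norm is small enough
        Ap = matvec(A, p)                      # the only Theta(n^2) step
        alpha = rs_old / dot(p, Ap)            # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old                 # direction update coefficient
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


if __name__ == "__main__":
    A = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    print(conjugate_gradient(A, b))
```

In this sequential form the matrix-vector product dominates the running time. Computing the n row products concurrently is one way a parallel implementation could reduce the per-iteration time toward Θ(n), in line with the scaling the abstract reports.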

Cite

APA

Lopes, A. R., & Constantinides, G. A. (2008). A high throughput FPGA-Based floating point conjugate gradient implementation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4943 LNCS, pp. 75–86). https://doi.org/10.1007/978-3-540-78610-8_10
