Peak performance model for a custom precision floating-point dot product on FPGAs

Abstract

On FPGAs, reduced resource usage of single operators translates directly into additional parallelism. For floating-point (FP) operators, such reduced resource usage can be achieved by reducing the mantissa bit width. The work presented here pursues two objectives. First, the maximum number of operands of a parallel dot product architecture is explored experimentally on an FPGA for different custom-precision FP number formats. Given the resources of this FPGA, it is shown that, based on non-pipelined basic FP operators, a dot product can be implemented for input vector sizes of 21, 57, and 123 in double, single, and half precision, respectively. This corresponds to peak performances of 1, 3.2, and 9.9 GFlop/s, respectively. Second, it is shown that the maximum dot product peak performance as a function of the precision used can be modeled by a function of the form $P(p) = c_1 + c_2 \cdot p^{c_3}$ for a given type of FPGA, library, and synthesis settings. Fitting experimental data to this model reveals similarities as well as differences among generations of devices. © 2011 Springer-Verlag Berlin Heidelberg.
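
To make the shape of the model concrete, the following minimal sketch determines c1, c2, and c3 so that the curve passes through the three performance figures quoted in the abstract. It rests on several assumptions that are not taken from the paper: p is taken to be the IEEE 754 stored mantissa width (10, 23, and 52 bits), the function names are hypothetical, and exact interpolation through three points stands in for the paper's fit to a larger set of experimental data. The resulting coefficients are therefore illustrative only.

```python
# Illustrative only. Mantissa widths are the IEEE 754 stored-fraction sizes
# (an assumption about how p is defined); the performance values are the
# three figures quoted in the abstract.
MANTISSA_BITS = [10, 23, 52]     # half, single, double precision
PEAK_GFLOPS = [9.9, 3.2, 1.0]


def model(p, c1, c2, c3):
    """Evaluate P(p) = c1 + c2 * p**c3."""
    return c1 + c2 * p ** c3


def interpolate_three_points(ps, perf, lo=-5.0, hi=-0.01, iterations=200):
    """Find c1, c2, c3 so the model passes exactly through three points.

    Eliminating c1 and c2 leaves a single equation in c3, which is solved
    by bisection on the bracket [lo, hi]. The default bracket is a
    hypothetical choice that contains a sign change for the data above.
    """
    (pa, pb, pc), (ya, yb, yc) = ps, perf
    target = (ya - yb) / (yb - yc)

    def residual(c3):
        return (pa ** c3 - pb ** c3) / (pb ** c3 - pc ** c3) - target

    if residual(lo) * residual(hi) > 0:
        raise ValueError("bracket does not contain a root")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if residual(lo) * residual(mid) <= 0:
            hi = mid          # root lies in [lo, mid]
        else:
            lo = mid          # root lies in [mid, hi]
    c3 = 0.5 * (lo + hi)
    c2 = (ya - yb) / (pa ** c3 - pb ** c3)
    c1 = ya - c2 * pa ** c3
    return c1, c2, c3


c1, c2, c3 = interpolate_three_points(MANTISSA_BITS, PEAK_GFLOPS)
for p, reported in zip(MANTISSA_BITS, PEAK_GFLOPS):
    print(f"p={p:2d}  reported={reported:4.1f}  model={model(p, c1, c2, c3):.3f}")
```

With only three points and three free parameters the curve reproduces the inputs exactly, so this says nothing about goodness of fit; it only shows that a negative exponent c3 gives the expected trend of rising peak performance as the mantissa narrows.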

Cite

Mücke, M., Lesser, B., & Gansterer, W. N. (2011). Peak performance model for a custom precision floating-point dot product on FPGAs. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6586 LNCS, pp. 399–406). https://doi.org/10.1007/978-3-642-21878-1_49
