FPGAs have the native feature that reduced resource usage of single operators can be directly translated in additional parallelism. For floating-point (FP) operators, such reduced resource usage can be achieved by reducing the mantissa bit width. The work presented here pursues two objectives: First, the maximum number of operands of a parallel dot product architecture is explored experimentally on an FPGA for different custom precision FP number formats. Given the resources of this FPGA, it is shown that based on non-pipelined basic FP operators, a dot product for input vector size 21, 57 and 123 can be implemented for double-, single- and half-precision, respectively. This corresponds to a respective peak performance of 1, 3.2 and 9.9 GFlop/s. Second, it is shown that the maximum dot product peak performance as a function of used precision can be modeled by a function of the form P(p) = c1 + c2 • pc3 , given a certain type of FPGA, library and synthesis settings. Fitting experimental data to this model reveals similarities as well as differences among generations of devices. © 2011 Springer-Verlag Berlin Heidelberg.
CITATION STYLE
Mücke, M., Lesser, B., & Gansterer, W. N. (2011). Peak performance model for a custom precision floating-point dot product on FPGAs. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6586 LNCS, pp. 399–406). https://doi.org/10.1007/978-3-642-21878-1_49
Mendeley helps you to discover research relevant for your work.