Computing elementary functions on large arrays is an essential part of many machine learning and signal processing algorithms. Since the introduction of floating-point computations in mainstream processors, table lookups, division, square root, and piecewise approximations have been essential components of elementary function implementations. However, we suggest that these operations cannot deliver high throughput on modern processors, and argue that algorithms which rely only on multiplication, addition, and integer operations would achieve higher performance. We propose four design principles for high-throughput elementary functions and suggest how to apply them to the implementation of log, exp, sin, and tan functions. We evaluate the performance and accuracy of the new algorithms on three recent x86 microarchitectures and demonstrate that they compare favorably to previously published research and vendor-optimized libraries. © 2014 Springer-Verlag.
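To make the idea of "only multiplication, addition, and integer operations" concrete, here is a minimal scalar sketch of exp. It is not the algorithm from the paper: it uses a textbook range reduction, a plain Taylor polynomial evaluated with Horner's rule, and an integer bit pattern to rebuild the power of two. The name `exp_sketch`, the degree-13 polynomial, and the restriction to |x| < 700 are all assumptions made for illustration.

```python
import math
import struct

# Constants for the sketch (assumed choices, not taken from the paper).
LOG2E = 1.4426950408889634        # 1 / ln(2)
LN2_HI = 6.93147180369123816490e-01   # high part of ln(2), low bits zero
LN2_LO = 1.90821492927058770002e-10   # low part of ln(2)

# Taylor coefficients 1/k! for k = 0..13. The division happens once,
# when the table is built, not inside the evaluation itself.
COEFFS = [1.0 / math.factorial(k) for k in range(14)]


def exp_sketch(x: float) -> float:
    """Approximate exp(x) for |x| < 700 (hypothetical helper).

    The evaluation path uses only multiply, add/subtract, one
    float-to-integer rounding, and integer bit manipulation: no table
    lookup, division, square root, or piecewise branch.
    """
    # Range reduction: x = n * ln(2) + r with |r| <= ln(2) / 2.
    n = math.floor(x * LOG2E + 0.5)
    r = (x - n * LN2_HI) - n * LN2_LO

    # Horner's rule: one multiply and one add per coefficient.
    p = COEFFS[-1]
    for c in reversed(COEFFS[:-1]):
        p = p * r + c

    # Build 2**n directly from its IEEE 754 binary64 bit pattern
    # (biased exponent in bits 52..62); valid for -1022 <= n <= 1023.
    bits = (n + 1023) << 52
    scale = struct.unpack("<d", struct.pack("<q", bits))[0]
    return p * scale
```

In a vectorized setting the same structure maps to fused multiply-add and integer shift instructions applied to every lane at once, which is the property the abstract argues for. A production version would also need to handle overflow, underflow, NaN, and infinities, which this sketch leaves out.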
Dukhan, M., & Vuduc, R. (2014). Methods for high-throughput computation of elementary functions. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8384 LNCS, pp. 86–95). Springer Verlag. https://doi.org/10.1007/978-3-642-55224-3_9