Theoretical peak FLOPS per instruction set: a tutorial

Abstract

Traditionally, evaluating the theoretical peak performance of a CPU in FLOPS (floating-point operations per second) was merely a matter of multiplying the frequency by the number of floating-point instructions per cycle. Today, however, CPUs have features such as vectorization, fused multiply-add, hyperthreading, and “turbo” mode. In this tutorial, we look into this theoretical peak for recent fully featured Intel CPUs and other hardware, taking into account not only the simple absolute peak, but also the relevant instruction sets, their encoding, and the frequency scaling behaviour of modern hardware.
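
As a rough illustration of the arithmetic the abstract describes, the sketch below multiplies frequency by floating-point instructions per cycle, then scales by vector lanes and by two when fused multiply-add is used. The function name `peak_flops`, its parameters, and the example CPU figures are hypothetical and not taken from the article; the frequency is assumed to be fixed, which “turbo” and frequency scaling make untrue in practice.

```python
def peak_flops(frequency_hz, instr_per_cycle, vector_bits=64,
               element_bits=64, fma=False, cores=1):
    """Estimate theoretical peak FLOPS under a fixed-frequency assumption.

    All parameters are illustrative inputs, not measured values.
    """
    # Number of floating-point elements processed by one vector instruction.
    lanes = vector_bits // element_bits
    # A fused multiply-add counts as two floating-point operations per lane.
    flops_per_instr = lanes * (2 if fma else 1)
    return cores * frequency_hz * instr_per_cycle * flops_per_instr


# "Traditional" estimate: scalar double precision, no FMA, one core.
scalar = peak_flops(3.0e9, 2)

# Hypothetical 4-core CPU: 256-bit vectors, double precision, 2 FMA per cycle.
vector = peak_flops(3.0e9, 2, vector_bits=256, element_bits=64,
                    fma=True, cores=4)

print(f"scalar peak: {scalar / 1e9:.0f} GFLOPS")
print(f"vector peak: {vector / 1e9:.0f} GFLOPS")
```

With these assumed figures, the vectorized FMA estimate is 32 times the scalar single-core one (4 cores, 4 lanes, 2 operations per FMA), which is why the instruction set and the frequency actually sustained under vector load matter so much to the result.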

Cite

Dolbeau, R. (2018). Theoretical peak FLOPS per instruction set: a tutorial. Journal of Supercomputing, 74(3), 1341–1377. https://doi.org/10.1007/s11227-017-2177-5
