Introducing a performance model for bandwidth-limited loop kernels

Abstract

We present a diagnostic performance model for bandwidth-limited loop kernels, founded on an analysis of modern cache-based microarchitectures. The model allows accurate performance prediction and evaluation for existing instruction codes, and it provides an in-depth understanding of how the performance at each level of the memory hierarchy is composed. To demonstrate the capabilities of the model, raw memory load, store, and copy operations and a stream vector triad are analyzed and benchmarked on three modern x86-type quad-core architectures. © 2010 Springer-Verlag Berlin Heidelberg.
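For readers unfamiliar with the benchmark kernels named in the abstract, the following is a minimal sketch of their memory access patterns. It is not taken from the paper: the function names, the `STREAMS` table, and the helper `explicit_bytes_per_iteration` are hypothetical, the triad is assumed to have the STREAM-style form `a[i] = b[i] + s * c[i]`, and elements are assumed to be 8-byte doubles. Python is used only to show the loop bodies; actual measurements of this kind would rely on compiled or assembly kernels.

```python
# Illustrative access patterns only, not the paper's benchmark code.

DOUBLE_BYTES = 8  # assumption: double-precision elements


def load(a):
    """Load-only kernel: read every element of a (sum reduction)."""
    total = 0.0
    for x in a:
        total += x
    return total


def store(a, value):
    """Store-only kernel: write every element of a."""
    for i in range(len(a)):
        a[i] = value


def copy(a, b):
    """Copy kernel: one load stream (b) and one store stream (a)."""
    for i in range(len(a)):
        a[i] = b[i]


def triad(a, b, c, s):
    """STREAM-style triad (assumed form): two load streams, one store stream."""
    for i in range(len(a)):
        a[i] = b[i] + s * c[i]


# Explicit (load streams, store streams) per loop iteration for each kernel.
STREAMS = {"load": (1, 0), "store": (0, 1), "copy": (1, 1), "triad": (2, 1)}


def explicit_bytes_per_iteration(kernel):
    """Bytes named by the loop body per iteration, ignoring cache effects
    such as additional transfers caused by write-allocate."""
    loads, stores = STREAMS[kernel]
    return (loads + stores) * DOUBLE_BYTES
```

The byte counts cover only the data streams written in the loop body. Any extra cache line transfers between memory hierarchy levels would have to be added on top of these figures and are deliberately left out of this sketch.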

Cite

Treibig, J., & Hager, G. (2010). Introducing a performance model for bandwidth-limited loop kernels. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6067 LNCS, pp. 615–624). https://doi.org/10.1007/978-3-642-14390-8_64
