Memory bandwidth: The true bottleneck of SIMD multimedia performance on a superscalar processor

7Citations
Citations of this article
10Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

This paper presents the performance of DSP, image and 3D applications on recent general-purpose microprocessors using streaming SIMD ISA extensions (integer and floating point). The 9 benchmarks benchmark we use for this evaluation have been optimized for DLP and caches use with SIMD extensions and data prefetch. The result of these cumulated optimizations is a speedup that ranges from 1. 9 to 7. 1. All the benchmarks were originaly computation bound and 7 becomes memory bandwidth bound with the addition of SIMD and data prefetch. Quadrupling the memory bandwidth has no effect on original kernels but improves the performance of SIMD kernels by 15-55%.

Cite

CITATION STYLE

APA

Sebot, J., & Drach-Temam, N. (2001). Memory bandwidth: The true bottleneck of SIMD multimedia performance on a superscalar processor. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2150, pp. 439–447). Springer Verlag. https://doi.org/10.1007/3-540-44681-8_63

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free