Memory bandwidth: The true bottleneck of SIMD multimedia performance on a superscalar processor

Julien Sebot; Nathalie Drach-Temam

Conference Proceedings, Open Access

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2001), vol. 2150, pp. 439-447

DOI: 10.1007/3-540-44681-8_63

Abstract

This paper presents the performance of DSP, image, and 3D applications on recent general-purpose microprocessors using streaming SIMD ISA extensions (integer and floating point). The nine benchmarks we use for this evaluation have been optimized for DLP and cache use with SIMD extensions and data prefetch. The result of these cumulative optimizations is a speedup that ranges from 1.9 to 7.1. All the benchmarks were originally computation bound, and seven become memory-bandwidth bound with the addition of SIMD and data prefetch. Quadrupling the memory bandwidth has no effect on the original kernels but improves the performance of the SIMD kernels by 15-55%.

Cite

Sebot, J., & Drach-Temam, N. (2001). Memory bandwidth: The true bottleneck of SIMD multimedia performance on a superscalar processor. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2150, pp. 439–447). Springer Verlag. https://doi.org/10.1007/3-540-44681-8_63
