An Overview of Cache Optimization Techniques and Cache-Aware Numerical Algorithms

  • Kowarschik M
  • Weiß C

Abstract

To mitigate the impact of the growing gap between CPU speed and main memory performance, today's computer architectures implement hierarchical memory structures. The idea behind this approach is to hide both the low bandwidth of main memory and the latency of main memory accesses, both of which are slow in comparison with the floating-point performance of the CPUs. At the top of the hierarchy sit the CPU registers: a small, expensive, high-speed memory, integrated into the processor chip, which provides data with low latency and high bandwidth. Moving further away from the CPU, the layers of memory successively become larger and slower. The memory components located between the processor core and main memory are called cache memories, or caches. They are intended to hold copies of main memory blocks in order to speed up accesses to frequently needed data [378], [392]. The next lower level of the memory hierarchy is main memory, which is large but comparatively slow. External memory, such as hard disk drives or remote memory components in a distributed computing environment, represents the lower end of any common hierarchical memory design; this paper focuses on optimization techniques for enhancing cache performance.
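To make the idea of cache-aware programming concrete, the sketch below contrasts a naive matrix multiplication with a blocked (tiled) variant; loop blocking is one of the classic cache optimization techniques surveyed in work of this kind. The matrix size N and tile size B are illustrative assumptions, not values from the paper; in practice B would be tuned so that the tiles of all three matrices fit in the target cache level.

#include <stdio.h>

#define N 512   /* matrix dimension; illustrative assumption */
#define B 64    /* tile size; assumed tuned so B x B tiles fit in cache */

static double A[N][N], Bm[N][N], C[N][N];

/* Naive version: the innermost loop walks Bm column by column, so
 * consecutive iterations touch addresses N*8 bytes apart and each
 * access may miss once the matrices exceed the cache capacity. */
static void matmul_naive(void)
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            for (int k = 0; k < N; k++)
                C[i][j] += A[i][k] * Bm[k][j];
}

/* Blocked (tiled) version: the computation is reordered into B x B
 * tiles so that each tile of A, Bm, and C is loaded into the cache
 * once and reused many times before being evicted. */
static void matmul_blocked(void)
{
    for (int ii = 0; ii < N; ii += B)
        for (int kk = 0; kk < N; kk += B)
            for (int jj = 0; jj < N; jj += B)
                for (int i = ii; i < ii + B; i++)
                    for (int k = kk; k < kk + B; k++) {
                        double a = A[i][k];   /* kept in a register */
                        for (int j = jj; j < jj + B; j++)
                            C[i][j] += a * Bm[k][j];
                    }
}

int main(void)
{
    /* Simple deterministic initialization. */
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            A[i][j]  = (double)(i + j) / N;
            Bm[i][j] = (double)(i - j) / N;
        }

    matmul_blocked();   /* swap in matmul_naive() to compare timings */
    printf("C[0][0] = %f\n", C[0][0]);
    return 0;
}

For matrices that no longer fit in cache, the blocked variant typically runs several times faster than the naive one, even though both perform the same arithmetic; this reuse-before-eviction effect, together with related techniques such as loop interchange, loop fusion, and array padding, is the subject of the paper.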

Cite

APA: Kowarschik, M., & Weiß, C. (2003). An Overview of Cache Optimization Techniques and Cache-Aware Numerical Algorithms (pp. 213–232). https://doi.org/10.1007/3-540-36574-5_10
