The large latency of memory accesses in modern computers is a key obstacle to achieving high processor utilization. To hide this latency, this paper proposes a new memory management technique that can be applied to computer architectures with three levels of memory. The technique takes advantage of access pattern information that is available at compile time by prefetching certain data elements from the higher-level memory. It also keeps certain data resident for a period of time to prevent unnecessary data swapping. Data locality is much improved compared with the usual pattern by partitioning the iteration space and reducing execution in each partition. Together, these approaches improve average execution times by approximately 35% over the one-level partition algorithm and by more than 80% over list scheduling and hardware prefetching.
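To make the general idea concrete, the following is a minimal sketch of a partitioned iteration space with software-style prefetching of the next partition. It is not the paper's algorithm: the rectangular tile shape, the single buffer of lookahead, and the names `partition_iteration_space`, `fetch_tile`, and `run_partitioned` are all assumptions made for illustration, and the "remote" memory is simulated with a plain nested list.

```python
from typing import Iterator, List, Tuple

Tile = Tuple[range, range]  # (row indices, column indices) of one partition


def partition_iteration_space(n_rows: int, n_cols: int,
                              tile_rows: int, tile_cols: int) -> Iterator[Tile]:
    """Split a 2D iteration space into rectangular partitions (hypothetical shape)."""
    for r0 in range(0, n_rows, tile_rows):
        for c0 in range(0, n_cols, tile_cols):
            yield (range(r0, min(r0 + tile_rows, n_rows)),
                   range(c0, min(c0 + tile_cols, n_cols)))


def fetch_tile(remote: List[List[int]], tile: Tile) -> List[List[int]]:
    """Simulate copying one partition from slow memory into a local buffer."""
    rows, cols = tile
    return [[remote[i][j] for j in cols] for i in rows]


def run_partitioned(remote: List[List[int]],
                    tile_rows: int, tile_cols: int) -> Tuple[int, List[str]]:
    """Sum all elements partition by partition, prefetching one partition ahead.

    Returns the total and a log of the order of prefetch/compute steps.
    In real hardware the prefetch would overlap with the computation; here
    the two steps are only issued in that order, one after the other.
    """
    if not remote or not remote[0]:
        return 0, []
    tiles = list(partition_iteration_space(len(remote), len(remote[0]),
                                           tile_rows, tile_cols))
    log: List[str] = []
    total = 0
    next_buffer = fetch_tile(remote, tiles[0])
    log.append("prefetch 0")
    for k in range(len(tiles)):
        current = next_buffer
        # Issue the prefetch for partition k+1 before computing on partition k.
        if k + 1 < len(tiles):
            next_buffer = fetch_tile(remote, tiles[k + 1])
            log.append(f"prefetch {k + 1}")
        total += sum(sum(row) for row in current)
        log.append(f"compute {k}")
    return total, log


if __name__ == "__main__":
    grid = [[i * 4 + j for j in range(4)] for i in range(4)]
    total, log = run_partitioned(grid, 2, 2)
    print(total)
    print(log)
```

The sketch only shows the ordering of work: each partition's data is requested before the previous partition finishes computing, so the fetch latency could be hidden behind useful work. Choosing the partition size and schedule, which is the subject of the paper, is left out.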
CITATION STYLE
Wang, Z., Kirkpatrick, M., & Sha, E. H. M. (2000). Optimal two level partitioning and loop scheduling for hiding memory latency for DSP applications. In Proceedings - Design Automation Conference (pp. 540–545). IEEE. https://doi.org/10.1145/337292.337571