Load execution latency reduction

Abstract

Load execution latency depends on memory access latency, pipeline depth, and data dependencies. Through load effective address prediction, both data dependencies and deep-pipeline effects can potentially be removed from the overall execution time. If a load's effective address is correctly predicted, the data cache can be accessed speculatively before the load executes, effectively reducing load execution latency. A hybrid load effective address prediction technique is proposed that combines three basic predictors: a Last Address Predictor (LAP), a Stride Predictor (SP), and a Global Dynamic Predictor (GDP). In addition to improving load address prediction accuracy, this work explores the balance of data ports in the cache memory hierarchy and the effects of load/store aliasing in wide superscalar machines. Results: Using a realistic hybrid load address predictor, load address prediction rates range from 32% to 77% (averaging 51%) for SPECint95 and from 60% to 96% (averaging 87%) for SPECfp95. For a wide superscalar machine with a significant number of execution resources, this prediction rate increases IPC by 12% for SPECint95 and 19% for SPECfp95. It is also shown that load/store aliasing decreases average IPC by 33% for SPECint95 and 24% for SPECfp95.
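
To make the hybrid scheme concrete, the following is a minimal sketch of a load effective-address predictor that combines a Last Address Predictor and a Stride Predictor with a per-entry chooser. The table organization, the 2-bit chooser counters, and the omission of the Global Dynamic Predictor are illustrative assumptions for this sketch, not the exact design evaluated in the paper.

class HybridLoadAddressPredictor:
    """Illustrative hybrid of a Last Address Predictor (LAP) and a
    Stride Predictor (SP); a 2-bit chooser per entry picks between them.
    This is a sketch under simplifying assumptions, not the paper's design."""

    def __init__(self, entries=4096):
        self.entries = entries
        self.last_addr = {}   # LAP: table index -> last effective address
        self.stride = {}      # SP:  table index -> (last address, stride)
        self.chooser = {}     # table index -> 2-bit counter (>= 2 favors SP)

    def _index(self, pc):
        # Direct-mapped indexing by load PC, as a simplifying assumption.
        return pc % self.entries

    def predict(self, pc):
        """Return a predicted effective address for the load at `pc`,
        or None if there is no usable history yet."""
        i = self._index(pc)
        lap_pred = self.last_addr.get(i)
        sp_state = self.stride.get(i)
        sp_pred = sp_state[0] + sp_state[1] if sp_state else None
        if sp_pred is not None and self.chooser.get(i, 1) >= 2:
            return sp_pred
        return lap_pred

    def update(self, pc, actual_addr):
        """Train both component predictors and the chooser once the
        load's effective address has resolved."""
        i = self._index(pc)
        lap_pred = self.last_addr.get(i)
        sp_state = self.stride.get(i)
        sp_pred = sp_state[0] + sp_state[1] if sp_state else None

        # Chooser: saturating counter moves toward whichever component
        # was correct when the two disagree.
        c = self.chooser.get(i, 1)
        if sp_pred == actual_addr and lap_pred != actual_addr:
            c = min(c + 1, 3)
        elif lap_pred == actual_addr and sp_pred != actual_addr:
            c = max(c - 1, 0)
        self.chooser[i] = c

        # SP: record the new address and the stride just observed.
        if sp_state is not None:
            self.stride[i] = (actual_addr, actual_addr - sp_state[0])
        else:
            self.stride[i] = (actual_addr, 0)
        # LAP: remember the most recent address.
        self.last_addr[i] = actual_addr

In use, predict(pc) at dispatch would drive a speculative data-cache access, while update(pc, addr) trains the tables once the load's effective address resolves; a misprediction requires discarding the speculatively fetched data.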

Citation (APA)
Black, B., Mueller, B., Postal, S., Rakvic, R., Utamaphethai, N., & Shen, J. P. (1998). Load execution latency reduction. In Proceedings of the International Conference on Supercomputing (pp. 29–36). ACM. https://doi.org/10.1145/277830.277842
