Using branch handling hardware to support profile-driven optimization

Abstract

Profile-based optimizations can be used for instruction scheduling, loop scheduling, data preloading, function in-lining, and instruction cache performance enhancement. However, these techniques have not been embraced by software vendors because programs instrumented for profiling run 2-30 times slower, an awkward compile-run-recompile sequence is required, and a test input suite must be collected and validated for each program. This paper proposes using existing branch handling hardware to generate profile information in real time. Techniques are presented for both one-level and two-level branch hardware organizations. The approach produces high accuracy with small slowdown in execution (0.4%-4.6%). This allows a program to be profiled while it is used, eliminating the need for a test input suite. This practically removes the inconvenience of profiling. With contemporary processors driven increasingly by compiler support, hardware-based profiling is important for high-performance systems.
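The abstract does not describe the paper's mechanisms in detail, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a one-level table of 2-bit saturating counters indexed by branch address, and a hypothetical periodic sampler that reads the predicted direction of each tracked branch to estimate its taken bias. The names `TwoBitPredictor`, `sample_profile`, and the `interval` parameter are illustrative assumptions.

```python
from collections import defaultdict


class TwoBitPredictor:
    """One-level table of 2-bit saturating counters, indexed by branch address."""

    def __init__(self):
        # Assumption: every counter starts at 1 (weakly not-taken).
        self.counters = defaultdict(lambda: 1)

    def predict(self, addr):
        # Counter values 2 and 3 predict taken; 0 and 1 predict not-taken.
        return self.counters[addr] >= 2

    def update(self, addr, taken):
        c = self.counters[addr]
        self.counters[addr] = min(c + 1, 3) if taken else max(c - 1, 0)


def sample_profile(trace, interval=4):
    """Estimate per-branch taken bias by sampling predictor state.

    trace: iterable of (branch_address, taken) pairs.
    interval: hypothetical sampling period, in executed branches.
    """
    predictor = TwoBitPredictor()
    # addr -> [samples that predicted taken, total samples]
    samples = defaultdict(lambda: [0, 0])
    for i, (addr, taken) in enumerate(trace, start=1):
        predictor.update(addr, taken)
        if i % interval == 0:
            # Snapshot every tracked branch, as a periodic handler might.
            for a in list(predictor.counters):
                samples[a][0] += predictor.predict(a)
                samples[a][1] += 1
    return {a: t / n for a, (t, n) in samples.items()}


# Synthetic trace: one always-taken branch and one never-taken branch.
trace = [(0x40, True), (0x80, False)] * 50
profile = sample_profile(trace)
```

On this synthetic trace the always-taken branch at `0x40` receives an estimate of 1.0 and the never-taken branch at `0x80` receives 0.0. Real branches are less regular, so a sampled estimate of this kind would only approximate the true taken frequency.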

Cite

Conte, T. M., Patel, B. A., & Cox, J. S. (1994). Using branch handling hardware to support profile-driven optimization. In Proceedings of the Annual International Symposium on Microarchitecture, MICRO (Vol. Part F129425, pp. 12–21). IEEE Computer Society. https://doi.org/10.1145/192724.192726