A framework for loop distribution on limited on-chip memory processors

12Citations
Citations of this article
4Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

This work proposes a framework for analyzing the flow of values and their re-use in loop nests to minimize data traffic under the constraints of limited on-chip memory capacity and dependences. Our analysis first undertakes fusion of possible loop nests intra-procedurally and then performs loop distribution. The analysis discovers the closeness factor of two statements which is a quantitative measure of data traffic saved per unit memory occupied if the statements were under the same loop nest over the case where they are under different loop nests. We then develop a greedy algorithm which traverses the program dependence graph (PDG) to group statements together under the same loop nest legally. The main idea of this greedy algorithm is to transitively generate a group of statements that can legally execute under a given loop nest that can lead to a minimum data traffic. We implemented our framework in Petit [2], a tool for dependence analysis and loop transformations. We show that the benefit due to our approach results in eliminating as much as % traffic in some cases improving overall completion time by a 23.33% for processors such as TI’s TMS320C5x.

Cite

CITATION STYLE

APA

Wang, L., Tembe, W., & Pande, S. (2000). A framework for loop distribution on limited on-chip memory processors. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1781, pp. 141–156). Springer Verlag. https://doi.org/10.1007/3-540-46423-9_10

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free