Dynamic memory requirement inference using Berstein basis
Available from www-2.dc.uba.ar
Page 1
Dynamic memory requirement inference using Berstein basis
Symbolic Prediction of Dynamic-Memory
Requirements?
V. Braberman1 and F. Fernandez1 and D. Garbervetsky1 and S. Yovine2 ??
1 Dep. Computacion, FCEYN, Universidad de Buenos Aires, Buenos Aires, Argentina
fvbraber,diegogg@dc.uba.ar, federico.fernandez@gmail.com
2 Verimag, Grenoble, France
sergio.yovine@imag.fr
Abstract. This work presents a technique to compute symbolic non-
linear approximations of the amount of dynamic memory required to
safely run a method in (Java-like) imperative programs. We do that for
scoped-memory management where objects are organized in regions as-
sociated with the lifetime of methods. Our approach resorts to a symbolic
non-linear optimization problem which is solved using Bernstein basis.
1 Introduction
Automatic dynamic memory management is a very powerful and useful mecha-
nism which does not come for free. Indeed, it is well known that garbage collec-
tion makes execution and response times extremely dicult to predict, mainly
because unbounded pause times. Several solutions have been proposed, either
by building garbage collectors with real-time performance, e.g. [2] (see [14] for a
survey), or by using a scope-based programming paradigm, e.g. [16,5,15,9]. How-
ever, there is still the problem of predicting how much memory a program will
need to run without crashing with an out-of-memory exception. This question
is inherently hard [17].
In a previous work we presented a technique for computing a parametric
upper-bound of the amount of memory dynamically requested by Java-like im-
perative programs [7]. The idea consists in quantifying dynamic allocations done
by a method. Given a method m with parameters p1; : : : ; pk we exhibit an al-
gorithm that computes a parametric non-linear expression over p1; : : : ; pk which
over-approximates the amount of memory allocated during the execution of m.
This bound is a symbolic over-approximation of the total amount of memory the
application requests to a virtual machine via new statements, but not the actual
amount of memory really consumed by the application. This is because memory
freed by the garbage collector is not taken into account. We also showed that
assuming a region-based memory management [16,5,15,9] where objects are or-
ganized in regions associated with computation units, the same technique allows
to obtain non-linear parametric bounds of the size of every memory region.
? Partially supported by ECOS project A06E02, ANCyT grant PICT 32440, UBA-
CyTX020 and IBM Innovations Grants.
?? Currently Invited Professor at DC, FCEYN, UBA.
Requirements?
V. Braberman1 and F. Fernandez1 and D. Garbervetsky1 and S. Yovine2 ??
1 Dep. Computacion, FCEYN, Universidad de Buenos Aires, Buenos Aires, Argentina
fvbraber,diegogg@dc.uba.ar, federico.fernandez@gmail.com
2 Verimag, Grenoble, France
sergio.yovine@imag.fr
Abstract. This work presents a technique to compute symbolic non-
linear approximations of the amount of dynamic memory required to
safely run a method in (Java-like) imperative programs. We do that for
scoped-memory management where objects are organized in regions as-
sociated with the lifetime of methods. Our approach resorts to a symbolic
non-linear optimization problem which is solved using Bernstein basis.
1 Introduction
Automatic dynamic memory management is a very powerful and useful mecha-
nism which does not come for free. Indeed, it is well known that garbage collec-
tion makes execution and response times extremely dicult to predict, mainly
because unbounded pause times. Several solutions have been proposed, either
by building garbage collectors with real-time performance, e.g. [2] (see [14] for a
survey), or by using a scope-based programming paradigm, e.g. [16,5,15,9]. How-
ever, there is still the problem of predicting how much memory a program will
need to run without crashing with an out-of-memory exception. This question
is inherently hard [17].
In a previous work we presented a technique for computing a parametric
upper-bound of the amount of memory dynamically requested by Java-like im-
perative programs [7]. The idea consists in quantifying dynamic allocations done
by a method. Given a method m with parameters p1; : : : ; pk we exhibit an al-
gorithm that computes a parametric non-linear expression over p1; : : : ; pk which
over-approximates the amount of memory allocated during the execution of m.
This bound is a symbolic over-approximation of the total amount of memory the
application requests to a virtual machine via new statements, but not the actual
amount of memory really consumed by the application. This is because memory
freed by the garbage collector is not taken into account. We also showed that
assuming a region-based memory management [16,5,15,9] where objects are or-
ganized in regions associated with computation units, the same technique allows
to obtain non-linear parametric bounds of the size of every memory region.
? Partially supported by ECOS project A06E02, ANCyT grant PICT 32440, UBA-
CyTX020 and IBM Innovations Grants.
?? Currently Invited Professor at DC, FCEYN, UBA.
Page 2
Here we propose a new technique to over-approximate the amount of memory
required to run a method (or a program). Given a method m with parameters
p1; : : : ; pk we obtain a polynomial upper-bound of the amount of memory nec-
essary to safely execute the method and all methods it calls, without running
out of memory. This polynomial can be seen as a pre-condition stating that the
method requires that much free memory to be available before executing, and
also as a certicate engaging the method is not going to use more memory than
the specied. To compute this estimation we consider memory deallocation that
may occur during the execution of the method. Basically, assuming a region-
based memory management we model all the potential congurations of regions
stacks at run-time. Since region sizes are expressed as polynomials, this model
leads to a symbolic non-linear optimization problem. This problem can be solved
using a technique using Bernstein basis [12].
To our knowledge, the technique used to infer non-linear dynamic memory re-
quirements under a region-based memory manager and its eective computation
using Bernstein basis is a novel approach to memory requirements calculus.
Applications of this set of techniques are manifold, from improvements in
memory management to the generation of parametric memory-allocation certi-
cates. These specications would enable application loaders and schedulers (e.g.,
[21]) to make decisions based on available memory resources and the memory-
consumption estimates.
Outline In section 2 we present a denition of the problem we want to solve
and some assumptions that we are making. In section 3 propose an eective
denition of a function that predict memory requirements for a scoped-based
memory management. In section 4 propose an approach to compute the memory
requirements function. In section 3 we show some experiments. In section 6 we
discuss some aspects of the technique that we would like to improve. In section 7
we discuss some related work. In section 8 we present our conclusions and future
work.
2 Problem statement
Let be a non-recursive, sequential Java-like program, given as a set M of
methods. W.l.o.g., we assume there is a distinguished method under analysis
mua 2 M, where the program starts. Each m 2 M is characterized by its
formal parameters Pm, its local variables Vm, and its body (list of statements)
stmm 2 S, where S is the set of statements (new, call, ret, ...). For the sake
of simplicity, we assume that parameters are of integer type.
A program state 2 consists of a control stack cst() 2 (M N), the
data stack val() 2 [(V [ P 7! V)] of valuations of variables and parameters
(with V =
S
m2M Vm and P =
S
m2M Pm), and the heap heap() O
2, O V,
modelled as a directed graph of objects. We write mk(), pck(), and valk() to
denote the k-th method, control location and data valuations, respectively.
An object o 2 O is said to be live in heap() i it is reachable from a variable
dened in the state, that is, there exists k such that valk()(v) has a path to o
2
required to run a method (or a program). Given a method m with parameters
p1; : : : ; pk we obtain a polynomial upper-bound of the amount of memory nec-
essary to safely execute the method and all methods it calls, without running
out of memory. This polynomial can be seen as a pre-condition stating that the
method requires that much free memory to be available before executing, and
also as a certicate engaging the method is not going to use more memory than
the specied. To compute this estimation we consider memory deallocation that
may occur during the execution of the method. Basically, assuming a region-
based memory management we model all the potential congurations of regions
stacks at run-time. Since region sizes are expressed as polynomials, this model
leads to a symbolic non-linear optimization problem. This problem can be solved
using a technique using Bernstein basis [12].
To our knowledge, the technique used to infer non-linear dynamic memory re-
quirements under a region-based memory manager and its eective computation
using Bernstein basis is a novel approach to memory requirements calculus.
Applications of this set of techniques are manifold, from improvements in
memory management to the generation of parametric memory-allocation certi-
cates. These specications would enable application loaders and schedulers (e.g.,
[21]) to make decisions based on available memory resources and the memory-
consumption estimates.
Outline In section 2 we present a denition of the problem we want to solve
and some assumptions that we are making. In section 3 propose an eective
denition of a function that predict memory requirements for a scoped-based
memory management. In section 4 propose an approach to compute the memory
requirements function. In section 3 we show some experiments. In section 6 we
discuss some aspects of the technique that we would like to improve. In section 7
we discuss some related work. In section 8 we present our conclusions and future
work.
2 Problem statement
Let be a non-recursive, sequential Java-like program, given as a set M of
methods. W.l.o.g., we assume there is a distinguished method under analysis
mua 2 M, where the program starts. Each m 2 M is characterized by its
formal parameters Pm, its local variables Vm, and its body (list of statements)
stmm 2 S, where S is the set of statements (new, call, ret, ...). For the sake
of simplicity, we assume that parameters are of integer type.
A program state 2 consists of a control stack cst() 2 (M N), the
data stack val() 2 [(V [ P 7! V)] of valuations of variables and parameters
(with V =
S
m2M Vm and P =
S
m2M Pm), and the heap heap() O
2, O V,
modelled as a directed graph of objects. We write mk(), pck(), and valk() to
denote the k-th method, control location and data valuations, respectively.
An object o 2 O is said to be live in heap() i it is reachable from a variable
dened in the state, that is, there exists k such that valk()(v) has a path to o
2
Page 3
in the heap. The set of live objects of is denoted live(). We dene dead() to
be heap() n live(). Let size(o) 1 be the amount of memory occupied by o.
memUsed() M=
P
o2heap() size(o), is the amount of memory occupied in .
The semantics is given by a deterministic transition system h;mua;!i,
where mua , is the set of initial states, such that for all 2 mua, cst() =
(mua; 0) and heap() = ;. The transition relation! is dened by the operational
semantics without deleting dead objects from the heap. For any 2 , we write
succ() to denote the state 0 such that ! 0. This relation is assumed to
be such that for any two states and that are equal except for the set of
dead objects, succ() is also equal to succ(), except for the set of dead objects.
Moreover, dead() dead(succ()), that is, it is not possible to forge pointers.
A (nite) run is a sequence = 0; 1; 2 , with i ! i+1, and
0 2 mua. We denote i the state corresponding to the i-th element of . A
run is uniquely determined by the values assigned to the parameters of mua. We
write # to denote the run determined by the parameter valuation #.
A dynamic memory manager is a function : 7! that removes from
the heap (a subset of) dead objects, that is, for all runs , () is such that for
all i 0, dead( (i)) dead(i), while keeping the rest of the state unchanged.
We call ideal manager the one such that dead(ideal(i)) = ;. From now on,
we will indistinctly denote (i) as i().
2.1 Peak consumption
The peak amount of memory consumed by a run #, is dened as follows:
peak (#)
M
= max
i
memUsed( i(#)) (2.1)
Clearly, we have that for all #, peakideal(#) peak (#), for all .
Determining the peak consumption at compile time would enable to know
\a priori" the amount of memory required to safely execute the program (from
mua) without running out of memory. However, predicting this peak is hard for
dierent reasons, as the following example tries to illustrate.
Example. Consider the program in Fig. 1. The objects allocated in method m4
at locations 12 and 13 (denoted as m4:12 and m4:13) cannot be collected when
m4 nishes its execution because they are referenced from outside. m4:12 can
be collected when m3 nishes its execution. m4:14 can be collected just at the
end of m2 or m1 when the last reference is removed. And so on. Fig. 2 depicts
the eects of allocations and deallocations on memory occupancy for dierent
executions (m0 invoked with mc = 3 and mc = 7, respectively) using the ideal
memory manager. The gure only shows memory occupation generated by ex-
plicit requests at allocation statements, that is, there is no allocation overhead
introduced by the memory manager. In the rst run the peak is reached after
m0 calls m2, and in the second one, the peak occurs after m0 calls m1. ut
This example pinpoints the issues that need to be considered for estimating
the peak: it is necessary (1) to estimate the amount of memory requested by a
3
be heap() n live(). Let size(o) 1 be the amount of memory occupied by o.
memUsed() M=
P
o2heap() size(o), is the amount of memory occupied in .
The semantics is given by a deterministic transition system h;mua;!i,
where mua , is the set of initial states, such that for all 2 mua, cst() =
(mua; 0) and heap() = ;. The transition relation! is dened by the operational
semantics without deleting dead objects from the heap. For any 2 , we write
succ() to denote the state 0 such that ! 0. This relation is assumed to
be such that for any two states and that are equal except for the set of
dead objects, succ() is also equal to succ(), except for the set of dead objects.
Moreover, dead() dead(succ()), that is, it is not possible to forge pointers.
A (nite) run is a sequence = 0; 1; 2 , with i ! i+1, and
0 2 mua. We denote i the state corresponding to the i-th element of . A
run is uniquely determined by the values assigned to the parameters of mua. We
write # to denote the run determined by the parameter valuation #.
A dynamic memory manager is a function : 7! that removes from
the heap (a subset of) dead objects, that is, for all runs , () is such that for
all i 0, dead( (i)) dead(i), while keeping the rest of the state unchanged.
We call ideal manager the one such that dead(ideal(i)) = ;. From now on,
we will indistinctly denote (i) as i().
2.1 Peak consumption
The peak amount of memory consumed by a run #, is dened as follows:
peak (#)
M
= max
i
memUsed( i(#)) (2.1)
Clearly, we have that for all #, peakideal(#) peak (#), for all .
Determining the peak consumption at compile time would enable to know
\a priori" the amount of memory required to safely execute the program (from
mua) without running out of memory. However, predicting this peak is hard for
dierent reasons, as the following example tries to illustrate.
Example. Consider the program in Fig. 1. The objects allocated in method m4
at locations 12 and 13 (denoted as m4:12 and m4:13) cannot be collected when
m4 nishes its execution because they are referenced from outside. m4:12 can
be collected when m3 nishes its execution. m4:14 can be collected just at the
end of m2 or m1 when the last reference is removed. And so on. Fig. 2 depicts
the eects of allocations and deallocations on memory occupancy for dierent
executions (m0 invoked with mc = 3 and mc = 7, respectively) using the ideal
memory manager. The gure only shows memory occupation generated by ex-
plicit requests at allocation statements, that is, there is no allocation overhead
introduced by the memory manager. In the rst run the peak is reached after
m0 calls m2, and in the second one, the peak occurs after m0 calls m1. ut
This example pinpoints the issues that need to be considered for estimating
the peak: it is necessary (1) to estimate the amount of memory requested by a
3
Page 4
void m0(int mc) {
1: m1(mc);
2: m2(3 * mc);
}
void m1(int k) {
3: B[][] dummyArr = new B[k][];
4: for (int i = 1; i <= k; i++) {
5: dummyArr[i-1]= m3(i);
}
}
void m2(int k2) {
6: B[] m3Arr=m3(k2);
}
B[] m3(int n) {
7: B[] arrB = new B[n];
8: N l = new N();
9: for (int j = 1; j <= n; j++) {
10: arrB[j-1] = m4(l,j);
}
11: return arrB;
}
B m4(N l, int v)
{
12: N c = new N();
13: c.value = new B(v);
14: c.next = l.next;
15: l.next = c;
16: return c.value;
}
Fig. 1. A sample program with its detailed call graph
method, and (2) to determine when objects will be collected, both in terms of
the parameters of mua, as dierent peak consumptions could occur at dierent
places (control locations, method instances, ...) depending on initial parameters.
2.2 Peak consumption for scoped-memory management
The ideal memory manager is optimal in terms of memory consumption. This
collector is used in works that verify memory usage certicates such as [11,3].
However, it is not well understood how to predict memory consumption for it.
In this work we follow a dierent strategy: we assume the presence of a scoped-
memory manager that reclaims memory only at the end of the execution of every
method. The collector is only allowed to claim for dead objects created during
the execution of the method (and the method it transitively calls). Objects
created in an outer scope cannot be collected by the current method and may be
reclaimed by some of the methods in the call stack. In particular, we will choose
a scoped-based memory management where objects are organized in regions and
each method has an associated region (denoted as an m-region) whose lifetime
corresponds with its associated method's lifetime [15]. To be safe, objects in a
region can point to objects in the same region or a parent region (corresponding
to a method that is in the call stack). This scoping restriction can be satised
inferring regions at compile-time by performing escape analysis [15,23,22].
4
1: m1(mc);
2: m2(3 * mc);
}
void m1(int k) {
3: B[][] dummyArr = new B[k][];
4: for (int i = 1; i <= k; i++) {
5: dummyArr[i-1]= m3(i);
}
}
void m2(int k2) {
6: B[] m3Arr=m3(k2);
}
B[] m3(int n) {
7: B[] arrB = new B[n];
8: N l = new N();
9: for (int j = 1; j <= n; j++) {
10: arrB[j-1] = m4(l,j);
}
11: return arrB;
}
B m4(N l, int v)
{
12: N c = new N();
13: c.value = new B(v);
14: c.next = l.next;
15: l.next = c;
16: return c.value;
}
Fig. 1. A sample program with its detailed call graph
method, and (2) to determine when objects will be collected, both in terms of
the parameters of mua, as dierent peak consumptions could occur at dierent
places (control locations, method instances, ...) depending on initial parameters.
2.2 Peak consumption for scoped-memory management
The ideal memory manager is optimal in terms of memory consumption. This
collector is used in works that verify memory usage certicates such as [11,3].
However, it is not well understood how to predict memory consumption for it.
In this work we follow a dierent strategy: we assume the presence of a scoped-
memory manager that reclaims memory only at the end of the execution of every
method. The collector is only allowed to claim for dead objects created during
the execution of the method (and the method it transitively calls). Objects
created in an outer scope cannot be collected by the current method and may be
reclaimed by some of the methods in the call stack. In particular, we will choose
a scoped-based memory management where objects are organized in regions and
each method has an associated region (denoted as an m-region) whose lifetime
corresponds with its associated method's lifetime [15]. To be safe, objects in a
region can point to objects in the same region or a parent region (corresponding
to a method that is in the call stack). This scoping restriction can be satised
inferring regions at compile-time by performing escape analysis [15,23,22].
4
Page 5
m0(3) m0(7)
Fig. 2. Two traces: m0(3) (above) and m0(7) (below).
Let r O be a region. We assume there is one (possibly empty) region
per method occurrence in the call stack: for any 2 , the region stack is
rst() = r1 : : : rt where m() = m1 : : :mt. The set of nodes of the heap is a
disjoint union of regions. A region-based memory manager only eliminates rt.
For all , () is as follows:
{ 0() = 0,
{ and for all i 0, i+1() is:
if stm(pct(i)) = ret, then
heap( i+1()) = heap( i()) n rt, and
rst( i+1()) = rst( i()) n rt,
otherwise i+1() = succ( i()) and rst( i+1()) is:
if the statement is a call, a new (empty) region is attached to the
method pushed on top of the stack,
if an object o is created (new), it is equal to rst( i()), except for the
region to which o is assigned to,
otherwise it remains unchanged.
Furthermore, to ensure that only dead objects are removed, the assignment of
new objects into regions is supposed to be done in such a way that on the return
of the method on the top of the stack, all objects in the top region are dead.
This holds from the fact that inter-region references comply with scoping rules.
The memory peak of a run collected by a region-based memory manager
is a function of the size of the regions. Let size(r) =
P
o2r size(o). Then:
peak (#) = maxi
X
1kjcst(#i )j
size(rstk( i(#))) (2.2)
The rest of this paper is devoted to develop a computational technique to
approximate the right-hand side of Eq. 2.2.
3 Computing peak consumption for scoped-memory
Let mua (M N) be the set of call stacks whose rst element is (mua; 0).
For 2 mua, we write [1::k] to denote the prex of length k, and k to denote
5
Fig. 2. Two traces: m0(3) (above) and m0(7) (below).
Let r O be a region. We assume there is one (possibly empty) region
per method occurrence in the call stack: for any 2 , the region stack is
rst() = r1 : : : rt where m() = m1 : : :mt. The set of nodes of the heap is a
disjoint union of regions. A region-based memory manager only eliminates rt.
For all , () is as follows:
{ 0() = 0,
{ and for all i 0, i+1() is:
if stm(pct(i)) = ret, then
heap( i+1()) = heap( i()) n rt, and
rst( i+1()) = rst( i()) n rt,
otherwise i+1() = succ( i()) and rst( i+1()) is:
if the statement is a call, a new (empty) region is attached to the
method pushed on top of the stack,
if an object o is created (new), it is equal to rst( i()), except for the
region to which o is assigned to,
otherwise it remains unchanged.
Furthermore, to ensure that only dead objects are removed, the assignment of
new objects into regions is supposed to be done in such a way that on the return
of the method on the top of the stack, all objects in the top region are dead.
This holds from the fact that inter-region references comply with scoping rules.
The memory peak of a run collected by a region-based memory manager
is a function of the size of the regions. Let size(r) =
P
o2r size(o). Then:
peak (#) = maxi
X
1kjcst(#i )j
size(rstk( i(#))) (2.2)
The rest of this paper is devoted to develop a computational technique to
approximate the right-hand side of Eq. 2.2.
3 Computing peak consumption for scoped-memory
Let mua (M N) be the set of call stacks whose rst element is (mua; 0).
For 2 mua, we write [1::k] to denote the prex of length k, and k to denote
5
Page 6
the k-th method. can be partioned according to call stacks: ; 0 2 are in
the same class i cst() = cst(0). Then, from Eq. 2.2 it follows that:
peak (#) = max
2mua
max
i st.=cst(#i )
X
1kjj
size(rstk( i(#))) (3.1)
The right-hand side of Eq. 3.1 can be over-approximated by summing up the
maximum region sizes along the run:
peak (#) max
2mua
X
1kjj
max
i st.=cst(#i )
size(rstk( i(#))) (3.2)
We can further over-approximate the right-hand side of Eq. 3.2 by considering
a partition for each stack-depth k, that is, ; 0 2 are in the same class for
depth k, i cst()[1::k] = cst(0)[1::k]. Thus:
peak (#) max
2mua
X
1kjj
max
i st.[1::k]=cst(#i )[1::k]
size(rstk( i(#))) (3.3)
Sincemua is nite and all 2 mua are of nite length, it follows that computing
an over-approximation of the peak reduces to computing maximum region sizes.
3.1 Approximating region sizes
Given a method m 2 M, let rsizem(Pm) be a function of m's parameters that
yields an over-approximation of the size of any region associated to m, that is,
for every state that contains a region rk associated with method m at position
k in the stack, size(rk) rsizem(valk()(Pm)). Thus, it follows that:
peak (#) max
2mua
X
1kjj
max
i st.[1::k]=cst(#i )[1::k]
rsizek(valk(
#
i )(Pk)) (3.4)
Eq. 3.4 does not depend on because valk(#i )(Pk) = valk( i(
#))(Pk).
The rsize functions can be computed using the techniques presented in [7].
Here, we illustrate the approach with an example.
Example. For instance, rsize functions for our motivating example are:
rsizem0 = 0
rsizem1 = size(B[])k + (size(B[]) + size(B))(
1
2
k2 +
1
2
k)
rsizem2 = (size(B[]) + size(B))k2
rsizem3 = size(N) + size(N)n
rsizem4 = 0
where size(B) and size(B[]) are the sizes of objects of type B and B[]. ut
6
the same class i cst() = cst(0). Then, from Eq. 2.2 it follows that:
peak (#) = max
2mua
max
i st.=cst(#i )
X
1kjj
size(rstk( i(#))) (3.1)
The right-hand side of Eq. 3.1 can be over-approximated by summing up the
maximum region sizes along the run:
peak (#) max
2mua
X
1kjj
max
i st.=cst(#i )
size(rstk( i(#))) (3.2)
We can further over-approximate the right-hand side of Eq. 3.2 by considering
a partition for each stack-depth k, that is, ; 0 2 are in the same class for
depth k, i cst()[1::k] = cst(0)[1::k]. Thus:
peak (#) max
2mua
X
1kjj
max
i st.[1::k]=cst(#i )[1::k]
size(rstk( i(#))) (3.3)
Sincemua is nite and all 2 mua are of nite length, it follows that computing
an over-approximation of the peak reduces to computing maximum region sizes.
3.1 Approximating region sizes
Given a method m 2 M, let rsizem(Pm) be a function of m's parameters that
yields an over-approximation of the size of any region associated to m, that is,
for every state that contains a region rk associated with method m at position
k in the stack, size(rk) rsizem(valk()(Pm)). Thus, it follows that:
peak (#) max
2mua
X
1kjj
max
i st.[1::k]=cst(#i )[1::k]
rsizek(valk(
#
i )(Pk)) (3.4)
Eq. 3.4 does not depend on because valk(#i )(Pk) = valk( i(
#))(Pk).
The rsize functions can be computed using the techniques presented in [7].
Here, we illustrate the approach with an example.
Example. For instance, rsize functions for our motivating example are:
rsizem0 = 0
rsizem1 = size(B[])k + (size(B[]) + size(B))(
1
2
k2 +
1
2
k)
rsizem2 = (size(B[]) + size(B))k2
rsizem3 = size(N) + size(N)n
rsizem4 = 0
where size(B) and size(B[]) are the sizes of objects of type B and B[]. ut
6
Page 7
3.2 Approximating maximum region sizes
Equation 3.4 still depends on the run determined by the valuation # of Pmua
to obtain the values of the parameters to evaluate rsize. These values can be
approximated by an invariant binding Pmua with the parameters of all methods
in given call chain . This binding invariant constrains the possible valuations
of variables stored in stack frames of methods invoked in . Such an invariant,
denoted Imua , can be obtained, for instance, from local invariants as in [7].
Example. For instance a valid binding invariant Im0:1:m1:5:m3mua in our example is
fk = mc; 1 i k; n = ig. ut
For m 2M and :m 2 mua, the maximum value of rsizem in all states where
:m is a prex of cst(), can be over-approximated by its maximum value over
a binding invariant I:mmua . Now, let:
maxrsize:mmua (Pmua)
M
= Maximize rsizem(Pm) sbj.to I:mmua (Pmua; Pm;W ) (3.5)
where W are local variables appearing in the methods in . Clearly,
max
i st.[1::k]=cst(#i )[1::k]
rsizek(valk(
#
i )(Pk)) maxrsize
[1::k]
mua (#) (3.6)
Example. rsizem3(n) = size(N) + size(N)n and Im0:1:m1:5:m3mua = fk = mc; 1
i k; n = ig, imply maxrsizem0m0:1:m1:5:m3(mc) = size(N) + size(N)mc. ut
Let
memRq(#) M= max
2mua
X
1kjj
maxrsize[1::k]mua (#) (3.7)
It follows that:
Theorem 1. For all #, peak (#) memRq(#).
Example. Table 1 shows maxrsize:mm0 for the example of Fig. 1. Call chain for this
example are prexes of m0:1:m1:5:m3:10:m4 and m0:2:m2:6:m3:10:m4. Since we
compute a sum over all regions for a given call chain, the sum over any prex of
these call chains will give a value which is lower or equal. Thus, it is enough to
apply the maximum to the result of these chains. Using the resulting maxrsize
expressions we can reduce memRqm0 to:
memRqm0(mc) = maxf
(size(B[]) + size(B))(
1
2
mc2 +
1
2
mc) + size(B[])mc+ size(N)(1 +mc);
(size(B[]) + size(B))3mc+ size(N)(1 + 3mc)g
Here, for simplicity, we will assume that size(T ) = 1 for all T . Then:
7
Equation 3.4 still depends on the run determined by the valuation # of Pmua
to obtain the values of the parameters to evaluate rsize. These values can be
approximated by an invariant binding Pmua with the parameters of all methods
in given call chain . This binding invariant constrains the possible valuations
of variables stored in stack frames of methods invoked in . Such an invariant,
denoted Imua , can be obtained, for instance, from local invariants as in [7].
Example. For instance a valid binding invariant Im0:1:m1:5:m3mua in our example is
fk = mc; 1 i k; n = ig. ut
For m 2M and :m 2 mua, the maximum value of rsizem in all states where
:m is a prex of cst(), can be over-approximated by its maximum value over
a binding invariant I:mmua . Now, let:
maxrsize:mmua (Pmua)
M
= Maximize rsizem(Pm) sbj.to I:mmua (Pmua; Pm;W ) (3.5)
where W are local variables appearing in the methods in . Clearly,
max
i st.[1::k]=cst(#i )[1::k]
rsizek(valk(
#
i )(Pk)) maxrsize
[1::k]
mua (#) (3.6)
Example. rsizem3(n) = size(N) + size(N)n and Im0:1:m1:5:m3mua = fk = mc; 1
i k; n = ig, imply maxrsizem0m0:1:m1:5:m3(mc) = size(N) + size(N)mc. ut
Let
memRq(#) M= max
2mua
X
1kjj
maxrsize[1::k]mua (#) (3.7)
It follows that:
Theorem 1. For all #, peak (#) memRq(#).
Example. Table 1 shows maxrsize:mm0 for the example of Fig. 1. Call chain for this
example are prexes of m0:1:m1:5:m3:10:m4 and m0:2:m2:6:m3:10:m4. Since we
compute a sum over all regions for a given call chain, the sum over any prex of
these call chains will give a value which is lower or equal. Thus, it is enough to
apply the maximum to the result of these chains. Using the resulting maxrsize
expressions we can reduce memRqm0 to:
memRqm0(mc) = maxf
(size(B[]) + size(B))(
1
2
mc2 +
1
2
mc) + size(B[])mc+ size(N)(1 +mc);
(size(B[]) + size(B))3mc+ size(N)(1 + 3mc)g
Here, for simplicity, we will assume that size(T ) = 1 for all T . Then:
7
Page 8
Table 1. Expression for function maxrsize for the example
:m Im0:m n maxrsize
:m
m0 (mc)
m0 true n 0
m0:1:m1 fk = mcg n (size(B[]) + size(B)):( 12mc
2 + 12mc) + size(B[])mc
m0:1:m1:5:m3 fmc 1; k = mc; 1 i k; n = ig n size(N) + size(N)mc
m0:1:m1:5:m3:10:m4 fmc 1; k = mc; 1 i k; n = i; 1 j n; v = jg n 0
m0:2:m2 fk2 = 3mcg n (size(B[]) + size(B))3mc
m0:2:m2:6:m3 fk2 = 3mc; n = k2g n size(N) + size(N)3mc
m0:2:m2:6:m3:10:m4 fmc 1; k2 = 2mc; n = k2; 1 j n; v = jg n 0
memRqm0(mc) = maxfmc
2 + 2mc+ 1 +mc; 6mc+ 1 + 3mcg
= 1 + maxfmc2 + 3mc; 9mcg = 1 + 3mc+ maxfmc2; 6mcg
= 1 + 3mc+
mc2 if mc < 0 _mc > 6
6mc if 0 mc 6
Fig. 3 shows that memRqm0 is an upper-bound of the actual memory require-
ments of the example of Fig. 1. It also shows how regions are created when
methods are invoked and released when they return. Light gray bars depict
rsizem1, dark gray bars rsizem3, and the others rsizem2. ut
Fig. 3. Consumption, rsize, and memRqm0 for m0(3) and m0(7).
4 Computing maxrsize and memRq
The formulation of maxrsize characterizes a non-linear maximization problem,
where the polynomial rsize represents the input and the binding invariant for the
control stack represents the restriction, whose solution is an expression in terms
of Pmua. Since our goal is to avoid expensive runtime computations, we need to
perform o-line reduction as much as possible at compile time. O-line calcula-
tion also means that the problem must be stated and solved symbolically. As a
consequence, it is not possible to resort to non-linear numerical optimization.
8
:m Im0:m n maxrsize
:m
m0 (mc)
m0 true n 0
m0:1:m1 fk = mcg n (size(B[]) + size(B)):( 12mc
2 + 12mc) + size(B[])mc
m0:1:m1:5:m3 fmc 1; k = mc; 1 i k; n = ig n size(N) + size(N)mc
m0:1:m1:5:m3:10:m4 fmc 1; k = mc; 1 i k; n = i; 1 j n; v = jg n 0
m0:2:m2 fk2 = 3mcg n (size(B[]) + size(B))3mc
m0:2:m2:6:m3 fk2 = 3mc; n = k2g n size(N) + size(N)3mc
m0:2:m2:6:m3:10:m4 fmc 1; k2 = 2mc; n = k2; 1 j n; v = jg n 0
memRqm0(mc) = maxfmc
2 + 2mc+ 1 +mc; 6mc+ 1 + 3mcg
= 1 + maxfmc2 + 3mc; 9mcg = 1 + 3mc+ maxfmc2; 6mcg
= 1 + 3mc+
mc2 if mc < 0 _mc > 6
6mc if 0 mc 6
Fig. 3 shows that memRqm0 is an upper-bound of the actual memory require-
ments of the example of Fig. 1. It also shows how regions are created when
methods are invoked and released when they return. Light gray bars depict
rsizem1, dark gray bars rsizem3, and the others rsizem2. ut
Fig. 3. Consumption, rsize, and memRqm0 for m0(3) and m0(7).
4 Computing maxrsize and memRq
The formulation of maxrsize characterizes a non-linear maximization problem,
where the polynomial rsize represents the input and the binding invariant for the
control stack represents the restriction, whose solution is an expression in terms
of Pmua. Since our goal is to avoid expensive runtime computations, we need to
perform o-line reduction as much as possible at compile time. O-line calcula-
tion also means that the problem must be stated and solved symbolically. As a
consequence, it is not possible to resort to non-linear numerical optimization.
8
Page 9
4.1 Computing maxrsize
To compute maxrsize, we resort to [12] which proposes an extension of Bernstein
expansion [4] for symbolically bounding the range of a polynomial over a linear
domain. The idea is as follows: Bernstein polynomials form a basis for the space
of polynomials, such that the coecients of a polynomial in Berstein basis give
minimum and maximum bounds on the polynomial values. Here, we do not go
into the details, which can be found in [13]. Let ~x = (x1; : : : ; xk) be a vector
of variables, and ~p = (p1; : : : ; pn) a vector of parameters. Given a polynomial
pol 2 Q[~x], and a convex polytope I 2 Qj~x~pj, there is a function
Ber : Q[~x]Qj~x~pj 7! 2Q
j~pj2Q[~p]
that yields a set f(Di; Ci)gi2[1;l] where Di 2 Qj~pj is a linear domain and Ci Q[~p]
is a set of \candidate" polynomials such that, for all p 2 Qn:
maxfpol(x) j I(p;x)g
8
<
:
maxfq(p) 2 C1g if D1(p)
:::
maxfq(p) 2 Clg if Dl(p)
(4.1)
We compute maxrsize by applying the Bernstein expansion to rsize, con-
strained by a linear binding invariant, with parameters Pmua. Di. That is:
maxrsize:mmua = Ber(rsizem; I
:m
mua ) (4.2)
Example. Table 2 shows the results for the example in Fig. 1. ut
Table 2. Computing the function maxrsize using Bernstein basis
maxrsizem0:1:m1:5:m3m0 = Ber(I
m0
m0:1:m1:5:m3; rsizem3)
Domain: fmc 1g Candidates: fmc+ 1g
Domain: fmc < 1g Candidates: f0g
maxrsizem0:2:m2:6:m3m0 = Ber(I
m0
m0:2:m2:6:m3; rsizem3)
Domain: true Candidates: f3mcg
maxrsizem0:1:m1m0 = Ber(I
m0
m0:1:m1; rsizem1)
Domain: true Candidates: fmc2 + 2mcg
maxrsizem0:2:m2m0 = Ber(I
m0
m0:2:m2; rsizem2)
Domain: true Candidates: f6mcg
4.2 Evaluating memRq
Recall that memRq(#) (Eq. 3.7) is basically a comparison between values to
choose the largest one. Here we present an alternative denition of memRq where
instead of comparing all potential call chains we recursively generate an evalua-
tion tree that gets the same results: memRqmua = memRq
mua
mua where
memRq:mmua = maxrsize
:m
mua + max
(m;l;mi)2Emua
memRq:m:l:mimua (4.3)
where Emua is the set of edges of the call graph rooted at mua.
9
To compute maxrsize, we resort to [12] which proposes an extension of Bernstein
expansion [4] for symbolically bounding the range of a polynomial over a linear
domain. The idea is as follows: Bernstein polynomials form a basis for the space
of polynomials, such that the coecients of a polynomial in Berstein basis give
minimum and maximum bounds on the polynomial values. Here, we do not go
into the details, which can be found in [13]. Let ~x = (x1; : : : ; xk) be a vector
of variables, and ~p = (p1; : : : ; pn) a vector of parameters. Given a polynomial
pol 2 Q[~x], and a convex polytope I 2 Qj~x~pj, there is a function
Ber : Q[~x]Qj~x~pj 7! 2Q
j~pj2Q[~p]
that yields a set f(Di; Ci)gi2[1;l] where Di 2 Qj~pj is a linear domain and Ci Q[~p]
is a set of \candidate" polynomials such that, for all p 2 Qn:
maxfpol(x) j I(p;x)g
8
<
:
maxfq(p) 2 C1g if D1(p)
:::
maxfq(p) 2 Clg if Dl(p)
(4.1)
We compute maxrsize by applying the Bernstein expansion to rsize, con-
strained by a linear binding invariant, with parameters Pmua. Di. That is:
maxrsize:mmua = Ber(rsizem; I
:m
mua ) (4.2)
Example. Table 2 shows the results for the example in Fig. 1. ut
Table 2. Computing the function maxrsize using Bernstein basis
maxrsizem0:1:m1:5:m3m0 = Ber(I
m0
m0:1:m1:5:m3; rsizem3)
Domain: fmc 1g Candidates: fmc+ 1g
Domain: fmc < 1g Candidates: f0g
maxrsizem0:2:m2:6:m3m0 = Ber(I
m0
m0:2:m2:6:m3; rsizem3)
Domain: true Candidates: f3mcg
maxrsizem0:1:m1m0 = Ber(I
m0
m0:1:m1; rsizem1)
Domain: true Candidates: fmc2 + 2mcg
maxrsizem0:2:m2m0 = Ber(I
m0
m0:2:m2; rsizem2)
Domain: true Candidates: f6mcg
4.2 Evaluating memRq
Recall that memRq(#) (Eq. 3.7) is basically a comparison between values to
choose the largest one. Here we present an alternative denition of memRq where
instead of comparing all potential call chains we recursively generate an evalua-
tion tree that gets the same results: memRqmua = memRq
mua
mua where
memRq:mmua = maxrsize
:m
mua + max
(m;l;mi)2Emua
memRq:m:l:mimua (4.3)
where Emua is the set of edges of the call graph rooted at mua.
9
Page 10
Fig. 4. Evaluation tree of the computation of the amount of memory required
to run m0 and its correlation with the application (unfolded) call graph
Evaluation trees enable symbolic compile-time manipulation in order to re-
duce the runtime eort of evaluation of the peak consumption prediction. An
evaluation tree for our example is presented in Fig. 4. The tree has a direct rela-
tion with the application call graph: max nodes are associated with branches in
the call graph (i.e. independent regions); sum nodes are related with adjacencies
in the call graph (i.e. regions that can live at the same time); leaves are associ-
ated with each node of the unfolded call graph (i.e. potential memory regions)
by using maxrsize as the operation that yields the largest region size. To model
the output of Ber for maxrsize evaluation, trees feature a case node (not shown
in Fig. 4) which provides a more general construction and models a set of pairs
(condition, evaluation-tree). Evaluation trees can be easily evaluated at runtime.
Nevertheless, reductions can be done at compile-time, for instance, by applying
powerful symbolic techniques for polynomial manipulation or by assuming some
loss of precision of the upper-bounds. Details can be found in [6].
Example Fig. 5 shows the evolution of the evaluation tree for the example Fig. 1.
The rst tree is the evaluation tree after applying Ber for solving the maximiza-
tion problem for maxrsize. The next two trees are successive simplications. To
go from the rst tree to the second one, we start by removing the Case node
by taking directly the case m + 1 by using the fact that the binding invariant
forces mc 1. Then, we sum the nodes in the left part of the max node get-
ting the expression 1 + 3mc + mc2. Since 3mc appears also in the right side
(6mc = 3mc + 3mc) we can factor that node. Then, to move from the second
tree to the third we convert the max node to a Case node after nding the inter-
val where one polynomial mc2 is above 6mc and vice versa. ut
Finally, evaluation trees are translated into Java code to be executed at runtime.
Assuming the number of domains and candidates is xed, in the worst case, the
number of evaluations is bounded by the number of branches of the call graph
(i.e., the number of edges of a compacted version of the call graph). In practice
the size of the evaluation tree is much smaller than the call graph, since many
of the comparisons can be solved o-line.
10
to run m0 and its correlation with the application (unfolded) call graph
Evaluation trees enable symbolic compile-time manipulation in order to re-
duce the runtime eort of evaluation of the peak consumption prediction. An
evaluation tree for our example is presented in Fig. 4. The tree has a direct rela-
tion with the application call graph: max nodes are associated with branches in
the call graph (i.e. independent regions); sum nodes are related with adjacencies
in the call graph (i.e. regions that can live at the same time); leaves are associ-
ated with each node of the unfolded call graph (i.e. potential memory regions)
by using maxrsize as the operation that yields the largest region size. To model
the output of Ber for maxrsize evaluation, trees feature a case node (not shown
in Fig. 4) which provides a more general construction and models a set of pairs
(condition, evaluation-tree). Evaluation trees can be easily evaluated at runtime.
Nevertheless, reductions can be done at compile-time, for instance, by applying
powerful symbolic techniques for polynomial manipulation or by assuming some
loss of precision of the upper-bounds. Details can be found in [6].
Example Fig. 5 shows the evolution of the evaluation tree for the example Fig. 1.
The rst tree is the evaluation tree after applying Ber for solving the maximiza-
tion problem for maxrsize. The next two trees are successive simplications. To
go from the rst tree to the second one, we start by removing the Case node
by taking directly the case m + 1 by using the fact that the binding invariant
forces mc 1. Then, we sum the nodes in the left part of the max node get-
ting the expression 1 + 3mc + mc2. Since 3mc appears also in the right side
(6mc = 3mc + 3mc) we can factor that node. Then, to move from the second
tree to the third we convert the max node to a Case node after nding the inter-
val where one polynomial mc2 is above 6mc and vice versa. ut
Finally, evaluation trees are translated into Java code to be executed at runtime.
Assuming the number of domains and candidates is xed, in the worst case, the
number of evaluations is bounded by the number of branches of the call graph
(i.e., the number of edges of a compacted version of the call graph). In practice
the size of the evaluation tree is much smaller than the call graph, since many
of the comparisons can be solved o-line.
10
Page 11
Fig. 5. Evaluation tree after computing maxrsize and two successive reductions.
5 Preliminary experiments
The initial set of experiments were carried out on a subset of programs from
JOlden [8] benchmarks. It is worth mentioning that these are classical bench-
marks and they are not biased towards embedded and loop intensive applications
the target application classes we had in mind when we devised the technique.
In order to make the result more readable, the tool computes the number of ob-
ject instances created when running the selected method, rather than the actual
memory allocated by the execution of the method. Table 3 shows the computed
peak expressions, and the comparison between real executions and estimations
obtained by evaluating the polynomials. The last column shows the relative error
((#Objs - Estimation)/Estimation).
Table 3. Experimental results
Example memRq Param. #Objs Estimation Err%
MST(nv) 1 + 94nv
2 + 3nv + 5 + maxfnv 1; 2g 10 253 270 6%
20 943 985 4%
100 22703 22905 1%
1000 2252003 2254005 0%
Em3d(nN; nD) 6nN:nD + 2nN + 14 + maxf6; 2nNg (10,5) 344 354 3%
(20,6) 804 814 1%
(100,7) 4604 4614 0%
(1000,8) 52004 52014 0%
BiSort(n) 6 + n 10 13 16 19%
20 21 26 19%
200 69 135 45%
64 69 70 1%
128 133 134 1%
Power() 32656 - 32420 32656 1%
These experiments show that the technique produced quite accurate results,
actually yielding almost exact gures in most benchmarks. In some cases, the
over-approximation was due to the presence of allocations associated with ex-
ceptions (which did not occur in the real execution), or because the number of
instances could not be expressed as a polynomial. For instance, in the bisort
example, the reason of the over-approximation is that the actual number of in-
stances is always bounded by 2i 1, with i = blog2nc. Indeed, the estimation
was exact for arguments power of 2.
11
5 Preliminary experiments
The initial set of experiments were carried out on a subset of programs from
JOlden [8] benchmarks. It is worth mentioning that these are classical bench-
marks and they are not biased towards embedded and loop intensive applications
the target application classes we had in mind when we devised the technique.
In order to make the result more readable, the tool computes the number of ob-
ject instances created when running the selected method, rather than the actual
memory allocated by the execution of the method. Table 3 shows the computed
peak expressions, and the comparison between real executions and estimations
obtained by evaluating the polynomials. The last column shows the relative error
((#Objs - Estimation)/Estimation).
Table 3. Experimental results
Example memRq Param. #Objs Estimation Err%
MST(nv) 1 + 94nv
2 + 3nv + 5 + maxfnv 1; 2g 10 253 270 6%
20 943 985 4%
100 22703 22905 1%
1000 2252003 2254005 0%
Em3d(nN; nD) 6nN:nD + 2nN + 14 + maxf6; 2nNg (10,5) 344 354 3%
(20,6) 804 814 1%
(100,7) 4604 4614 0%
(1000,8) 52004 52014 0%
BiSort(n) 6 + n 10 13 16 19%
20 21 26 19%
200 69 135 45%
64 69 70 1%
128 133 134 1%
Power() 32656 - 32420 32656 1%
These experiments show that the technique produced quite accurate results,
actually yielding almost exact gures in most benchmarks. In some cases, the
over-approximation was due to the presence of allocations associated with ex-
ceptions (which did not occur in the real execution), or because the number of
instances could not be expressed as a polynomial. For instance, in the bisort
example, the reason of the over-approximation is that the actual number of in-
stances is always bounded by 2i 1, with i = blog2nc. Indeed, the estimation
was exact for arguments power of 2.
11
Page 12
6 Discussion
Peak consumption for an arbitrary method: Our technique assumes that
mua is the program's starting method. If we want to use our technique for an
arbitrary method m we need to consider the fact that m (or some method it
may transitively call) may allocate some object whose lifetime exceeds its m's
lifetime. In this case, objects should be allocated in some region of a caller of m.
To deal with this situation we introduce a new function denoted memEscapesm
which yields an over-approximation, in terms of Pm, of the amount of dynamic
memory allocated by objects created during the execution of m that cannot
be released, meaning that they have to be allocated in other callers' regions.
memEscapes provides useful information to callers of m as they must consider
that the call to m will require some additional space of their own regions. As we
do for rsize in [7] we propose technique to automatically infer memEscapes.
To cope with this extension, we only need to take into account the amount of
memory escaping m as escape information is absorbent. By absorbent we mean
that any object escaping the scope of a method m0, transitively called by m, is
eventually captured by some method in the call stack m; : : : ;m0. Thus, memRq
is redened as follows:
memRqmua
M
= memEscapesmua + memRq
mua
mua (6.1)
About the parameterization of memRq: We dened peak in terms of Pmua
and we assumed Pmua are of integer type. Nevertheless, a method may be invoke
with parameter of more complex data types and consumption may be more di-
rectly related with other expressions derivable from the parameters. For instance,
suppose that we want to know the amount of memory required to run a method
clone(c: Collection) that returns a fresh copy of a collection c. Clearly, the
size of c is relevant for computing the memory requirements. To cope with this,
we can use a new variable size for the peak calculation and relate it with c using
a predicate size = c:size().
For this, we devise an alternative denition of peak that allows introducing
new variables and a uses a predicate relating these variables with the method's
formal parameters and the objects reachable from the parameters. In practice,
this denition is supported by relating these new variables with the formal pa-
rameters using the binding invariant.
Another related issue is the fact that the method under analysis may start
with a non-empty heap, holding an input data structure (such as a collec-
tion).Nevertheless, we can easily deal with non-empty initial heaps, by modifying
the peak denition as:
peak (#)
M
= max
i
memUsed( i(#)) memUsed(#0 ) (6.2)
The rest of the derivations follow similarly.
12
Peak consumption for an arbitrary method: Our technique assumes that
mua is the program's starting method. If we want to use our technique for an
arbitrary method m we need to consider the fact that m (or some method it
may transitively call) may allocate some object whose lifetime exceeds its m's
lifetime. In this case, objects should be allocated in some region of a caller of m.
To deal with this situation we introduce a new function denoted memEscapesm
which yields an over-approximation, in terms of Pm, of the amount of dynamic
memory allocated by objects created during the execution of m that cannot
be released, meaning that they have to be allocated in other callers' regions.
memEscapes provides useful information to callers of m as they must consider
that the call to m will require some additional space of their own regions. As we
do for rsize in [7] we propose technique to automatically infer memEscapes.
To cope with this extension, we only need to take into account the amount of
memory escaping m as escape information is absorbent. By absorbent we mean
that any object escaping the scope of a method m0, transitively called by m, is
eventually captured by some method in the call stack m; : : : ;m0. Thus, memRq
is redened as follows:
memRqmua
M
= memEscapesmua + memRq
mua
mua (6.1)
About the parameterization of memRq: We dened peak in terms of Pmua
and we assumed Pmua are of integer type. Nevertheless, a method may be invoke
with parameter of more complex data types and consumption may be more di-
rectly related with other expressions derivable from the parameters. For instance,
suppose that we want to know the amount of memory required to run a method
clone(c: Collection) that returns a fresh copy of a collection c. Clearly, the
size of c is relevant for computing the memory requirements. To cope with this,
we can use a new variable size for the peak calculation and relate it with c using
a predicate size = c:size().
For this, we devise an alternative denition of peak that allows introducing
new variables and a uses a predicate relating these variables with the method's
formal parameters and the objects reachable from the parameters. In practice,
this denition is supported by relating these new variables with the formal pa-
rameters using the binding invariant.
Another related issue is the fact that the method under analysis may start
with a non-empty heap, holding an input data structure (such as a collec-
tion).Nevertheless, we can easily deal with non-empty initial heaps, by modifying
the peak denition as:
peak (#)
M
= max
i
memUsed( i(#)) memUsed(#0 ) (6.2)
The rest of the derivations follow similarly.
12
Page 13
Dealing with recursion and complex data structures: Two major short-
comings of our technique are the restriction about recursion and support for
more complex data structures. We do not allow recursion because our technique
relies on having a nite evaluation tree. Although we believe that this restriction
is acceptable for embedded systems, we are working in alternative solutions. For
instance, it is possible to provide and use peak memory-requirements specica-
tion for whole mutually recursive set of methods considering them as being only
one method. Besides, some particular cases, such as tail-recursion can be han-
dled by working with a compacted call graph, provided binding invariants are
available. This is the way the bisort case study has been analyzed in Section 3.
In [7] we present some solutions to deal with some typical iteration patterns in
collections. We are also studying the possibility of combining our technique with
approaches like [10,11] that seems to be suitable for the verication of Presburger
expressions accounting for memory consumption annotations for class methods.
We believe that it is possible to devise a technique integrating our analysis
together with those mentioned type-checking based ones. The approach would
be as follows. While methods for data container classes (like the ones provided
by standard libraries) are annotated and veried by type-checking techniques,
loop-intensive applications built on top of those veried libraries may be analyzed
using our approach. Benets are twofold: rst, work done by our technique would
be reduced since we would have to deal with signicatively smaller call graphs,
and second, our ability to synthesize non-linear consumption expressions would
entail an increase of expressive power of type-checking based techniques.
7 Related Work
The problem of dynamic memory estimation has been studied for functional
languages in [18,19,25]. The work in [18] statically infers, by typing derivation
and linear programming, linear expressions that depend on function parameters.
The technique is stated for functional programs running under a particular mem-
ory mechanism (free list of cells and explicit deallocation in pattern matching).
The computed expressions are linear constraints on the sizes of various parts of
data. Our technique is meant to work for java like programs, is better suited
for a region-based memory manager and it does compute non-linear paramet-
ric expressions. [19] proposed a variant of ML extended with region constructs
[24] together with a type system based on the notion of sized types [20] (lin-
ear constrains), such that well typed programs are proven to execute within the
given memory bounds given as linear constrains. Although, their work is meant
for rst-order functional languages, they also rely on regions to control objects
deallocation. The technique proposed in [25] consists in, given a function, con-
structing a new function that symbolically mimics the memory allocations of
the former. The computed function has to be executed over a valuation of pa-
rameters to obtain a memory bound for that assignment. The evaluation of the
bound function might not terminate, even if the original program does.
For imperative object-oriented languages, solutions have been proposed in
[17,10,11,1]. The technique of [17] manipulates symbolic arithmetic expressions
13
comings of our technique are the restriction about recursion and support for
more complex data structures. We do not allow recursion because our technique
relies on having a nite evaluation tree. Although we believe that this restriction
is acceptable for embedded systems, we are working in alternative solutions. For
instance, it is possible to provide and use peak memory-requirements specica-
tion for whole mutually recursive set of methods considering them as being only
one method. Besides, some particular cases, such as tail-recursion can be han-
dled by working with a compacted call graph, provided binding invariants are
available. This is the way the bisort case study has been analyzed in Section 3.
In [7] we present some solutions to deal with some typical iteration patterns in
collections. We are also studying the possibility of combining our technique with
approaches like [10,11] that seems to be suitable for the verication of Presburger
expressions accounting for memory consumption annotations for class methods.
We believe that it is possible to devise a technique integrating our analysis
together with those mentioned type-checking based ones. The approach would
be as follows. While methods for data container classes (like the ones provided
by standard libraries) are annotated and veried by type-checking techniques,
loop-intensive applications built on top of those veried libraries may be analyzed
using our approach. Benets are twofold: rst, work done by our technique would
be reduced since we would have to deal with signicatively smaller call graphs,
and second, our ability to synthesize non-linear consumption expressions would
entail an increase of expressive power of type-checking based techniques.
7 Related Work
The problem of dynamic memory estimation has been studied for functional
languages in [18,19,25]. The work in [18] statically infers, by typing derivation
and linear programming, linear expressions that depend on function parameters.
The technique is stated for functional programs running under a particular mem-
ory mechanism (free list of cells and explicit deallocation in pattern matching).
The computed expressions are linear constraints on the sizes of various parts of
data. Our technique is meant to work for java like programs, is better suited
for a region-based memory manager and it does compute non-linear paramet-
ric expressions. [19] proposed a variant of ML extended with region constructs
[24] together with a type system based on the notion of sized types [20] (lin-
ear constrains), such that well typed programs are proven to execute within the
given memory bounds given as linear constrains. Although, their work is meant
for rst-order functional languages, they also rely on regions to control objects
deallocation. The technique proposed in [25] consists in, given a function, con-
structing a new function that symbolically mimics the memory allocations of
the former. The computed function has to be executed over a valuation of pa-
rameters to obtain a memory bound for that assignment. The evaluation of the
bound function might not terminate, even if the original program does.
For imperative object-oriented languages, solutions have been proposed in
[17,10,11,1]. The technique of [17] manipulates symbolic arithmetic expressions
13
Page 14
on unknowns that are not necessarily program variables, but added by the anal-
ysis to represent, for instance, loop iterations. The resulting formula has to be
evaluated on an instantiation of the unknowns left to obtain the upper-bound.
No benchmarking is available to assess the impact of this technique in practice.
Nevertheless, three points may be made. Since the unknowns may not be pro-
gram inputs, it is not clear how instances are produced. Second, it seems to be
quite over-pessimistic for programs with dynamically created arrays whose size
depends on loop variables and third, it does not consider any memory collection
mechanism. The method proposed in [10,11] relies on a type system and type
annotations, similar to [19]. It does not actually synthesize memory bounds, but
statically checks whether size annotations (Presburger's formulas) are veried.
It is therefore up to the programmer to state the size constraints, which are
indeed linear. Their type system allows aliasing and object deallocation (dis-
pose) annotations. Our technique does not allow such annotations and indeed
our memory model is more restricted. But as a counterpart we we can infer non-
linear bounds. The reason we do not support individual object deallocation is our
current impossibility of computing lower bounds which are required for safely
compare the dierence between allocations and deallocations. More recently Al-
ter et al. [1] propose a technique for parametric cost analysis for sequential Java
code. The code is translated to a recursive representation with a
attened stack.
Then, they infer size relations which are similar to our linear invariants. Using
the size relation, and the recursive program representation they compute cost
relations which are set of recurrent equation in term of input parameters. Ap-
plied to memory consumption the bounds that this technique is able to infer
are not limited to polynomials. However, solving recurrence equations is not a
trivial task and is not always possible to obtain closed form solutions for a set
of recurrence equations. They outline some proposals to approximate solutions.
Object deallocation is not considered.
8 Conclusions and Future work
We presented a novel technique to compute non-linear parametric upper-bounds
of the amount of dynamic memory required by a method. The technique is devel-
oped for region-based dynamic memory management, when regions are directly
associated with methods, but it can be used safely to predict memory require-
ments for memory management mechanism that free memory by demand. The
inputs of the techniques are the application call graph enriched with binding
invariant information to constraint calling contexts, a set of parametric expres-
sions that bounds the size of every region and a mapping from allocation points
to regions (we can compute both information using the technique proposed in
[7]) and yields a parametric certicate of the memory required to run a method
(or program). This certicates are given in the form of evaluation trees that
can be easily translated to code that can be evaluated in runtime to pre-allocate
regions. The size of the evaluation trees are known at compile time and can be re-
duced either using mathematical tools to symbolically solve maximums between
polynomials or by compromising some accuracy of run-time calculations.
14
ysis to represent, for instance, loop iterations. The resulting formula has to be
evaluated on an instantiation of the unknowns left to obtain the upper-bound.
No benchmarking is available to assess the impact of this technique in practice.
Nevertheless, three points may be made. Since the unknowns may not be pro-
gram inputs, it is not clear how instances are produced. Second, it seems to be
quite over-pessimistic for programs with dynamically created arrays whose size
depends on loop variables and third, it does not consider any memory collection
mechanism. The method proposed in [10,11] relies on a type system and type
annotations, similar to [19]. It does not actually synthesize memory bounds, but
statically checks whether size annotations (Presburger's formulas) are veried.
It is therefore up to the programmer to state the size constraints, which are
indeed linear. Their type system allows aliasing and object deallocation (dis-
pose) annotations. Our technique does not allow such annotations and indeed
our memory model is more restricted. But as a counterpart we we can infer non-
linear bounds. The reason we do not support individual object deallocation is our
current impossibility of computing lower bounds which are required for safely
compare the dierence between allocations and deallocations. More recently Al-
ter et al. [1] propose a technique for parametric cost analysis for sequential Java
code. The code is translated to a recursive representation with a
attened stack.
Then, they infer size relations which are similar to our linear invariants. Using
the size relation, and the recursive program representation they compute cost
relations which are set of recurrent equation in term of input parameters. Ap-
plied to memory consumption the bounds that this technique is able to infer
are not limited to polynomials. However, solving recurrence equations is not a
trivial task and is not always possible to obtain closed form solutions for a set
of recurrence equations. They outline some proposals to approximate solutions.
Object deallocation is not considered.
8 Conclusions and Future work
We presented a novel technique to compute non-linear parametric upper-bounds
of the amount of dynamic memory required by a method. The technique is devel-
oped for region-based dynamic memory management, when regions are directly
associated with methods, but it can be used safely to predict memory require-
ments for memory management mechanism that free memory by demand. The
inputs of the techniques are the application call graph enriched with binding
invariant information to constraint calling contexts, a set of parametric expres-
sions that bounds the size of every region and a mapping from allocation points
to regions (we can compute both information using the technique proposed in
[7]) and yields a parametric certicate of the memory required to run a method
(or program). This certicates are given in the form of evaluation trees that
can be easily translated to code that can be evaluated in runtime to pre-allocate
regions. The size of the evaluation trees are known at compile time and can be re-
duced either using mathematical tools to symbolically solve maximums between
polynomials or by compromising some accuracy of run-time calculations.
14
Page 15
The precision of the technique relies on several factors: the precision of the
inputs (region sizes and invariants), the structure of the program that may allow
or disallow two active regions get its maximum size at the same time, the pre-
cision of the Bernstein approximation and reductions applied to the evaluation
tree. More benchmarks would be needed to assess its precision, but results on
JOlden let us think it is a promising approach. We are working to enhance our
technique to support recursion and to combine it with others which are better
suited for more complex or recursive data structures.
References
1. E. Albert, P. Arenas, S. Genaim, G. Puebla, and D. Zanardini. Cost analysis of java bytecode.
In ESOP 07, volume 4421 of LNCS, pages 157{172. Springer, 2007.
2. D. Bacon, P. Cheng, and D. Grove. Garbage collection for embedded systems. In EMSOFT'04,
2004.
3. G. Barthe, M. Pavlova, and G. Schneider. Precise analysis of memory consumption using
program logics. In SEFM 05, pages 86{95. IEEE Computer Society, 2005.
4. S. Bernstein. Collected Works, volume 1-2. USSR Academy of Sciences, 1954.
5. G. Bollella and J. Gosling. The Real-Time Specication for Java. Addison-Wesley Longman
Publishing Co., Inc., 2000.
6. V. Braberman, F. Fernandez, D. Garbervetsky, and S. Yovine. Symbolic prediction of dynamic
memory requirements using bernstein basis. Tech.Rep, DC, FCEyN. UBA, oct 2007.
7. V. Braberman, D. Garbervetsky, and S. Yovine. A static analysis for synthesizing parametric
specications of dynamic memory consumption. Journal of Object Technology, 5(5):31{58,
2006.
8. B. Cahoon and K. S. McKinley. Data
ow analysis for software prefetching linked data struc-
tures in java controller. In PACT 2001, pages 280{291, 2001.
9. S. Cherem and R. Rugina. Region analysis and transformation for Java programs. ISMM'04,
2004.
10. W. Chin, S. Khoo, S. Qin, C. Popeea, and H. Nguyen. Verifying safety policies with size
properties and alias controls. In ICSE 2005, 2005.
11. W. Chin, H. H. Nguyen, S. Qin, and M. Rinard. Memory usage verication for oo programs.
In SAS 05, 2005.
12. Ph. Clauss and I. Tchoupaeva. A symbolic approach to Bernstein expansion for program analysis
and optimization. In CC 04, volume 2985 of LNCS, pages 120{133. Springer, April 2004.
13. P. Clauss, F. Fernandez, D. Garbervetsky, and S. Verdoolaege. Symbolic polynomial maximiza-
tion over convex sets and its application to memory requirement estimation. Tech.Rep.06-04,
Universite Louis Pasteur, oct 2006.
14. D. Detlefs. A hard look at hard real-time garbage collection. In Proceedings of the Sev-
enth IEEE International Symposium on Object-Oriented Real-Time Distributed Computing
(ISORC04), page 2332, Vienna, May 2004.
15. D. Garbervetsky, C. Nakhli, S. Yovine, and H. Zorgati. Program instrumentation and run-time
analysis of scoped memory in java. RV 04, ETAPS 2004, ENTCS, Barcelona, Spain, April
2004.
16. D. Gay and A. Aiken. Language support for regions. In PLDI 01, pages 70{80, 2001.
17. O. Gheorghioiu. Statically determining memory consumption of real-time java threads. MEng
thesis, Massachusetts Institute of Technology, June 2002.
18. M. Hofman and S. Jost. Static prediction of heap usage for rst-order functional programs. In
POPL 03, New Orleans, LA, January 2003.
19. J. Hughes and L. Pareto. Recursion and dynamic data-structures in bounded space: towards
embedded ml programming. In ICFP '99, 1999.
20. J. Hughes, L. Pareto, and A. Sabry. Proving the correctness of reactive systems using sized
types. In POPL '96, pages 410{423. ACM, 1996.
21. Ch. Kloukinas, Ch. Nakhli, and S. Yovine. A methodology and tool support for generating
scheduled native code for real-time java applications. In EMSOFT'03, Philadelphia, USA,
October 2003.
22. G. Salagnac, C. Rippert, and S. Yovine. Semi-automatic region-based memory management for
real-time java embedded systems. In Proceedings of 13th IEEE International Conference on
Embedded and Real-Time Computing Systems and Applications (RTCSA'07), August 2007.
23. G. Salagnac, S. Yovine, and D. Garbervetsky. Fast escape analysis for region-based memory
management. ENTCS, 131:99{110, 2005.
24. M. Tofte and J.P. Talpin. Region-based memory management. Information and Computation,
1997.
25. L. Unnikrishnan, S.D. Stoller, and Y.A. Liu. Optimized live heap bound analysis. In VMCAI
03, volume 2575 of LNCS, pages 70{85, January 2003.
15
inputs (region sizes and invariants), the structure of the program that may allow
or disallow two active regions get its maximum size at the same time, the pre-
cision of the Bernstein approximation and reductions applied to the evaluation
tree. More benchmarks would be needed to assess its precision, but results on
JOlden let us think it is a promising approach. We are working to enhance our
technique to support recursion and to combine it with others which are better
suited for more complex or recursive data structures.
References
1. E. Albert, P. Arenas, S. Genaim, G. Puebla, and D. Zanardini. Cost analysis of java bytecode.
In ESOP 07, volume 4421 of LNCS, pages 157{172. Springer, 2007.
2. D. Bacon, P. Cheng, and D. Grove. Garbage collection for embedded systems. In EMSOFT'04,
2004.
3. G. Barthe, M. Pavlova, and G. Schneider. Precise analysis of memory consumption using
program logics. In SEFM 05, pages 86{95. IEEE Computer Society, 2005.
4. S. Bernstein. Collected Works, volume 1-2. USSR Academy of Sciences, 1954.
5. G. Bollella and J. Gosling. The Real-Time Specication for Java. Addison-Wesley Longman
Publishing Co., Inc., 2000.
6. V. Braberman, F. Fernandez, D. Garbervetsky, and S. Yovine. Symbolic prediction of dynamic
memory requirements using bernstein basis. Tech.Rep, DC, FCEyN. UBA, oct 2007.
7. V. Braberman, D. Garbervetsky, and S. Yovine. A static analysis for synthesizing parametric
specications of dynamic memory consumption. Journal of Object Technology, 5(5):31{58,
2006.
8. B. Cahoon and K. S. McKinley. Data
ow analysis for software prefetching linked data struc-
tures in java controller. In PACT 2001, pages 280{291, 2001.
9. S. Cherem and R. Rugina. Region analysis and transformation for Java programs. ISMM'04,
2004.
10. W. Chin, S. Khoo, S. Qin, C. Popeea, and H. Nguyen. Verifying safety policies with size
properties and alias controls. In ICSE 2005, 2005.
11. W. Chin, H. H. Nguyen, S. Qin, and M. Rinard. Memory usage verication for oo programs.
In SAS 05, 2005.
12. Ph. Clauss and I. Tchoupaeva. A symbolic approach to Bernstein expansion for program analysis
and optimization. In CC 04, volume 2985 of LNCS, pages 120{133. Springer, April 2004.
13. P. Clauss, F. Fernandez, D. Garbervetsky, and S. Verdoolaege. Symbolic polynomial maximiza-
tion over convex sets and its application to memory requirement estimation. Tech.Rep.06-04,
Universite Louis Pasteur, oct 2006.
14. D. Detlefs. A hard look at hard real-time garbage collection. In Proceedings of the Sev-
enth IEEE International Symposium on Object-Oriented Real-Time Distributed Computing
(ISORC04), page 2332, Vienna, May 2004.
15. D. Garbervetsky, C. Nakhli, S. Yovine, and H. Zorgati. Program instrumentation and run-time
analysis of scoped memory in java. RV 04, ETAPS 2004, ENTCS, Barcelona, Spain, April
2004.
16. D. Gay and A. Aiken. Language support for regions. In PLDI 01, pages 70{80, 2001.
17. O. Gheorghioiu. Statically determining memory consumption of real-time java threads. MEng
thesis, Massachusetts Institute of Technology, June 2002.
18. M. Hofman and S. Jost. Static prediction of heap usage for rst-order functional programs. In
POPL 03, New Orleans, LA, January 2003.
19. J. Hughes and L. Pareto. Recursion and dynamic data-structures in bounded space: towards
embedded ml programming. In ICFP '99, 1999.
20. J. Hughes, L. Pareto, and A. Sabry. Proving the correctness of reactive systems using sized
types. In POPL '96, pages 410{423. ACM, 1996.
21. Ch. Kloukinas, Ch. Nakhli, and S. Yovine. A methodology and tool support for generating
scheduled native code for real-time java applications. In EMSOFT'03, Philadelphia, USA,
October 2003.
22. G. Salagnac, C. Rippert, and S. Yovine. Semi-automatic region-based memory management for
real-time java embedded systems. In Proceedings of 13th IEEE International Conference on
Embedded and Real-Time Computing Systems and Applications (RTCSA'07), August 2007.
23. G. Salagnac, S. Yovine, and D. Garbervetsky. Fast escape analysis for region-based memory
management. ENTCS, 131:99{110, 2005.
24. M. Tofte and J.P. Talpin. Region-based memory management. Information and Computation,
1997.
25. L. Unnikrishnan, S.D. Stoller, and Y.A. Liu. Optimized live heap bound analysis. In VMCAI
03, volume 2575 of LNCS, pages 70{85, January 2003.
15
Sign up today - FREE
Mendeley saves you time finding and organizing research. Learn more
- All your research in one place
- Add and import papers easily
- Access it anywhere, anytime
Start using Mendeley in seconds!
Readership Statistics
1 Reader on Mendeley
by Discipline
by Academic Status
100% Assistant Professor
by Country
100% Argentina


