Branching Data Structures for Real-Time Model Checking Not As Good As Thought
Available from
Esteban Pavese's profile on Mendeley.
Page 1
Branching Data Structures for Real-Time Model Checking Not As Good As Thought
Branching Data Structures for Real-Time
Model Checking Not As Good As Thought
Esteban Pavese, Gervasio Perez, and Fernando Schapachnik
Departamento de Computacion, FCEyN,
Universidad de Buenos Aires, Buenos Aires, Argentina ?
{epavese,gperez,fschapac}@dc.uba.ar
Abstract. Clock Dierence Diagrams (CDDs), a BDD-like data struc-
ture for model checking of timed automata, were presented in 1999. After
the original article the work on them seems to have stopped, although
there are still important open questions.
CDD denition required that repeated subtrees were aliased, but no
clear algorithm was presented for producing such compact representa-
tion, which seems costly to achieve. Also, since then, case studies have
increased in size.
In this article we revisit CDDs by comparing their performance against
DBMs on current case studies, with and without repeated subtrees.
Our experiments show that CDDs still require more time and memory
than DBMs, that eliminating repetitions is still not enough. Thus, this
article re-opens issues that previous work on the topic considered closed.
1 Introduction and Previous Work
In current days timed systems are both pervasive and critical, ranging from
embedded and PDAs to plant and
ight controllers. Their complexity is ever
increasing so automated ways of verifying them make sense. Automated methods,
however, are known to suer from scalability problems: their time and memory
requirements grow exponentially as systems increase in size. This is why any
technique that can palliate such problems is useful.
We focus on exploration of timed automata [1] {an extension of nite au-
tomata with the possibility to model dense time{ by on-the-
y forward reacha-
bility analysis [2]. Well-known tools in this area, such as Kronos [3] and UP-
PAAL [4], represent sets of clock valuations by means of a data structure called
Dierence Bound Matrices (DBMs for short) [5]. DBMs are (n + 1) (n + 1)
matrices, where n is the number of clocks in the system, counting 0 as a special
clock. Each cell represents the upper bound on the dierence between the cor-
responding clocks. Although Section 2 presents these concepts in more detail,
a brief description is required: said data structure provides support for a graph
reachability-like algorithm (depicted in Fig. 1) which saves the encountered sym-
bolic states in a set called Visited .
? Partially supported by UBACyT 2008 X634, UBACyT X021, ANPCyT PICT 2007
532, ANPCyT PICT 2005 32440.
Model Checking Not As Good As Thought
Esteban Pavese, Gervasio Perez, and Fernando Schapachnik
Departamento de Computacion, FCEyN,
Universidad de Buenos Aires, Buenos Aires, Argentina ?
{epavese,gperez,fschapac}@dc.uba.ar
Abstract. Clock Dierence Diagrams (CDDs), a BDD-like data struc-
ture for model checking of timed automata, were presented in 1999. After
the original article the work on them seems to have stopped, although
there are still important open questions.
CDD denition required that repeated subtrees were aliased, but no
clear algorithm was presented for producing such compact representa-
tion, which seems costly to achieve. Also, since then, case studies have
increased in size.
In this article we revisit CDDs by comparing their performance against
DBMs on current case studies, with and without repeated subtrees.
Our experiments show that CDDs still require more time and memory
than DBMs, that eliminating repetitions is still not enough. Thus, this
article re-opens issues that previous work on the topic considered closed.
1 Introduction and Previous Work
In current days timed systems are both pervasive and critical, ranging from
embedded and PDAs to plant and
ight controllers. Their complexity is ever
increasing so automated ways of verifying them make sense. Automated methods,
however, are known to suer from scalability problems: their time and memory
requirements grow exponentially as systems increase in size. This is why any
technique that can palliate such problems is useful.
We focus on exploration of timed automata [1] {an extension of nite au-
tomata with the possibility to model dense time{ by on-the-
y forward reacha-
bility analysis [2]. Well-known tools in this area, such as Kronos [3] and UP-
PAAL [4], represent sets of clock valuations by means of a data structure called
Dierence Bound Matrices (DBMs for short) [5]. DBMs are (n + 1) (n + 1)
matrices, where n is the number of clocks in the system, counting 0 as a special
clock. Each cell represents the upper bound on the dierence between the cor-
responding clocks. Although Section 2 presents these concepts in more detail,
a brief description is required: said data structure provides support for a graph
reachability-like algorithm (depicted in Fig. 1) which saves the encountered sym-
bolic states in a set called Visited .
? Partially supported by UBACyT 2008 X634, UBACyT X021, ANPCyT PICT 2007
532, ANPCyT PICT 2005 32440.
Page 2
1: function ForwardReach(Property )
2: Visited ;
3: Pending f(l0; z0)g
4: while Pending 6= ;
5: (l; z) next(Pending)
6: if (l; z) j= return YES
7: end if
8: Zl
S
(l;z0)2(Visited[Pending) z
0
9: for (l0; z0) 2 suc.(l; z)
10: if z0 * Zl
11: Add((l0; z0);Pending)
12: Add((l0; z0);Visited)
13: end if
14: end for
15: end while
16: return NO
17: end function
Fig. 1. Forward reachability algorithm.
For both warranting termination and avoiding duplicated exploration, timed
automata reachability requires knowing if a newly discovered symbolic state is
already covered by the existing ones. This is inecient when working with DBMs,
because they can only represent convex clock valuations. The traditional solution
is to employ sets of DBMs, but it is usually very hard to detect superpositions
there, leading to repetitions of calculations and other drawbacks.
There have been many attempts to overcome this problem by using BDD-like
data structures [6], which are often used in untimed systems. See, for example,
Asarin et al. [7] and Strehl et al. [8] among others. The one with the most
potential, however, was a 1999 article by Behrmann et al. In [9] they presented
Clock Dierence Diagrams (CDDs for short). Their work used DBMs for most
of the operations and CDDs for the Visited set, obtaining memory savings at
the cost of extra time.
We consider that CDDs needed to be revisited because:
{ Besides their use as a representation for the Visited set, a complete reacha-
bility algorithm that did not rely on DBMs was yet to be provided.
{ Being a tree-like data structure, they could have repeated subtrees. CDDs
require that repeated subtrees are replaced by aliasing in a maximal way
(referred to as \maximal sharing" in the original article). Because detecting
repeated subtrees can be expensive, we believe the topic requires further
analysis (we elaborate on this on Section 4).
{ The original case studies used only systems with a maximum of 5 clocks. The
verication algorithm's complexity, being O(n!2nCn), where n is number of
clocks and C is the largest constant appearing in the inequalities [10], is
dominated by the number of clocks.
2: Visited ;
3: Pending f(l0; z0)g
4: while Pending 6= ;
5: (l; z) next(Pending)
6: if (l; z) j= return YES
7: end if
8: Zl
S
(l;z0)2(Visited[Pending) z
0
9: for (l0; z0) 2 suc.(l; z)
10: if z0 * Zl
11: Add((l0; z0);Pending)
12: Add((l0; z0);Visited)
13: end if
14: end for
15: end while
16: return NO
17: end function
Fig. 1. Forward reachability algorithm.
For both warranting termination and avoiding duplicated exploration, timed
automata reachability requires knowing if a newly discovered symbolic state is
already covered by the existing ones. This is inecient when working with DBMs,
because they can only represent convex clock valuations. The traditional solution
is to employ sets of DBMs, but it is usually very hard to detect superpositions
there, leading to repetitions of calculations and other drawbacks.
There have been many attempts to overcome this problem by using BDD-like
data structures [6], which are often used in untimed systems. See, for example,
Asarin et al. [7] and Strehl et al. [8] among others. The one with the most
potential, however, was a 1999 article by Behrmann et al. In [9] they presented
Clock Dierence Diagrams (CDDs for short). Their work used DBMs for most
of the operations and CDDs for the Visited set, obtaining memory savings at
the cost of extra time.
We consider that CDDs needed to be revisited because:
{ Besides their use as a representation for the Visited set, a complete reacha-
bility algorithm that did not rely on DBMs was yet to be provided.
{ Being a tree-like data structure, they could have repeated subtrees. CDDs
require that repeated subtrees are replaced by aliasing in a maximal way
(referred to as \maximal sharing" in the original article). Because detecting
repeated subtrees can be expensive, we believe the topic requires further
analysis (we elaborate on this on Section 4).
{ The original case studies used only systems with a maximum of 5 clocks. The
verication algorithm's complexity, being O(n!2nCn), where n is number of
clocks and C is the largest constant appearing in the inequalities [10], is
dominated by the number of clocks.
Page 3
In this work, we evaluate memory and time performance of three implemen-
tations for representing non-convex state sets: sets of DBMs (which is the stan-
dard representation), (non-convex) CDDs without aliasing (i.e., with repeated
subtrees) and (non-convex) CDDs with aliasing (i.e., without repeated subtrees).
Through some well known case studies, signicant for nowadays standards,
we show that although CDDs do reduce the number of states found, operating
on them imposes an important overhead in terms of both memory and time.
The culprit is to be found in the size of non-convex CDDs, and the complexity
of operating in its unbounded branching structure, which distinctively separates
CDDs from BDDs.
Interestingly, using aliasing to suppress repeated subtrees is not enough. Al-
though our strategy to detect repeated substructures is lightweight, and eec-
tively reduces the memory footprint, time and memory measurements are still
considerably above the traditional strategy of using sets of non-convexes. Sec-
tion 4 deepens on the subject, after presenting the necessary background on
timed automata in Section 2 and CDDs in Section 3.
Relevant experimentation is presented and analyzed in Sections 5 and 6, and
Section 7 rounds up the article with our conclusions and future road map.
For completeness, CRDs, short for Clock Restriction Diagrams, should also be
mentioned. They were introduced by Wang in [11] and are similar to CDDs, but
represent explicitly only one bound. Although in this work we focus on CDDs,
Section 5 does perform a comparison against CRDs.
2 Background
Timed automata [1] are a widely used formalism to model and analyze timed
systems. They are supported by several tools such as Kronos [3], Zeus [12] or
UPPAAL [4]. Their semantics are based on labeled state-transition systems and
time-divergent runs over them. Here we present their basic notions and refer the
reader to [1, 3] for a complete formal presentation.
Denition 1 (Timed automaton). A timed automaton (TA) is a tuple A =
hL;X;;E; I; l0i, where L is a nite set of locations, X is a nite set of clocks
(non-negative real variables), is a set of labels, E is a nite set of edges,
I : L
tot
! X is a total function associating to each location a clock constraint
called the location's invariant, and l0 2 L is the initial location. Each edge in
E is a tuple hl; a; ; ; l0i, where l 2 L is the source location, l0 2 L is the
target location, a 2 is the label, 2 X is the guard, X is the set of
clocks reset at the edge. The set of clock constraints X for a set of clocks X
is dened according to the following grammar: X 3 ::= x cj ^ , where
x 2 X;2 f<;;=; >;g (although invariants restrict to f<;g) and c 2 IN.
Usually, a TA A has an associated mapping Pr : L 7! 2Props which assigns to
each location a subset of propositional variables from the set Props.
The parallel composition A k B of TAs A and B is dened using a label-
synchronized product of automata [1, 3]. At any time, the state of the system
tations for representing non-convex state sets: sets of DBMs (which is the stan-
dard representation), (non-convex) CDDs without aliasing (i.e., with repeated
subtrees) and (non-convex) CDDs with aliasing (i.e., without repeated subtrees).
Through some well known case studies, signicant for nowadays standards,
we show that although CDDs do reduce the number of states found, operating
on them imposes an important overhead in terms of both memory and time.
The culprit is to be found in the size of non-convex CDDs, and the complexity
of operating in its unbounded branching structure, which distinctively separates
CDDs from BDDs.
Interestingly, using aliasing to suppress repeated subtrees is not enough. Al-
though our strategy to detect repeated substructures is lightweight, and eec-
tively reduces the memory footprint, time and memory measurements are still
considerably above the traditional strategy of using sets of non-convexes. Sec-
tion 4 deepens on the subject, after presenting the necessary background on
timed automata in Section 2 and CDDs in Section 3.
Relevant experimentation is presented and analyzed in Sections 5 and 6, and
Section 7 rounds up the article with our conclusions and future road map.
For completeness, CRDs, short for Clock Restriction Diagrams, should also be
mentioned. They were introduced by Wang in [11] and are similar to CDDs, but
represent explicitly only one bound. Although in this work we focus on CDDs,
Section 5 does perform a comparison against CRDs.
2 Background
Timed automata [1] are a widely used formalism to model and analyze timed
systems. They are supported by several tools such as Kronos [3], Zeus [12] or
UPPAAL [4]. Their semantics are based on labeled state-transition systems and
time-divergent runs over them. Here we present their basic notions and refer the
reader to [1, 3] for a complete formal presentation.
Denition 1 (Timed automaton). A timed automaton (TA) is a tuple A =
hL;X;;E; I; l0i, where L is a nite set of locations, X is a nite set of clocks
(non-negative real variables), is a set of labels, E is a nite set of edges,
I : L
tot
! X is a total function associating to each location a clock constraint
called the location's invariant, and l0 2 L is the initial location. Each edge in
E is a tuple hl; a; ; ; l0i, where l 2 L is the source location, l0 2 L is the
target location, a 2 is the label, 2 X is the guard, X is the set of
clocks reset at the edge. The set of clock constraints X for a set of clocks X
is dened according to the following grammar: X 3 ::= x cj ^ , where
x 2 X;2 f<;;=; >;g (although invariants restrict to f<;g) and c 2 IN.
Usually, a TA A has an associated mapping Pr : L 7! 2Props which assigns to
each location a subset of propositional variables from the set Props.
The parallel composition A k B of TAs A and B is dened using a label-
synchronized product of automata [1, 3]. At any time, the state of the system
Page 4
is determined by the location and the values of clocks, which must satisfy the
location invariant. The system can evolve in two dierent ways: either an enabled
transition is taken, changing the location and (maybe) resetting some clocks
while the others keep their values unaltered (a discrete step), or it may let some
amount of time pass (a timed step). In the latter case, the system remains in the
same location and all clocks increase according to the elapsed time, while still
satisfying the location invariant.
It is important to note that, in the timed framework, the existence of real-
valued clocks generates an innite state space (locations plus clocks valuations).
Fortunately, this does not imply undecidability of many interesting problems
such as state reachability.
Denition 2 (Clock valuations). A valuation is a total function from the
clock set X into IR+ (i.e., the reading of each clock in a particular moment).
The valuation set over X, VX is dened as fv : X
tot
! IR+g. For each v 2 VX and
2 IR+, v+ stands for the valuation dened as (8x 2 X)(v+ )(x) = v(x) + .
To deal with innite state manipulation, convex sets of clock valuations
are symbolically represented as conjunctions of inequalities (e.g., 1 x 5 ^
x y > 8). Each of these conjunctions represents a convex (and innite) set
of points, and is referred to as zone. A data structure called Dierence Bound
Matrices (DBM) [5] is typically used to manipulate such kind of information.
Non-convex sets are represented as unions of convex sets (or CDDs in this arti-
cle).
Not every constraint needs to be present for all the operations. Actually,
a reduced version of the constraint systems can be used for most operations,
thus saving memory [13]. In practice, most tools use a variation of DBMs, called
Minimal Constraint Representation, which employs that idea. As these DBMs are
sparse, they are not stored like proper matrices, but as a linked list of constraints,
in order to save space. This minimal representation sets the context of this article.
That is, when we mention DBMs, we are always referring to this compact version.
Symbolic states are represented by a pair (l; z) where l is a location and z
a timed zone. Given a state1, the timed successor set is computed by the suc.
operator, dened as suc.(l; z) = f(l0; z0)= hl; a; ; ; l0i 2 E^z0 = suc (reset(z\
)) \ I(l0)g, where reset means putting the clocks in to zero and suc ( )
means replacing the constraints of the form x c by x < 1 while leaving the
rest intact. The details can be consulted in [14, Chapt. 2] or [15].
The basic (conceptual) procedure for checking reachability of property
requires a queue of symbolic states to be explored, commonly known as Pending ,
and a set of already visited states, known as Visited . The algorithm works like
this: insert the initial state in the Pending queue and initialize an empty Visited
set. Then, while Pending is not empty, take a state from it and check whether
the property holds for it. If it does, nish with a \YES" result, otherwise, put it
in Visited , and compute its (timed) successors (one for each outgoing transition
1 From now on states are implicitly symbolic.
location invariant. The system can evolve in two dierent ways: either an enabled
transition is taken, changing the location and (maybe) resetting some clocks
while the others keep their values unaltered (a discrete step), or it may let some
amount of time pass (a timed step). In the latter case, the system remains in the
same location and all clocks increase according to the elapsed time, while still
satisfying the location invariant.
It is important to note that, in the timed framework, the existence of real-
valued clocks generates an innite state space (locations plus clocks valuations).
Fortunately, this does not imply undecidability of many interesting problems
such as state reachability.
Denition 2 (Clock valuations). A valuation is a total function from the
clock set X into IR+ (i.e., the reading of each clock in a particular moment).
The valuation set over X, VX is dened as fv : X
tot
! IR+g. For each v 2 VX and
2 IR+, v+ stands for the valuation dened as (8x 2 X)(v+ )(x) = v(x) + .
To deal with innite state manipulation, convex sets of clock valuations
are symbolically represented as conjunctions of inequalities (e.g., 1 x 5 ^
x y > 8). Each of these conjunctions represents a convex (and innite) set
of points, and is referred to as zone. A data structure called Dierence Bound
Matrices (DBM) [5] is typically used to manipulate such kind of information.
Non-convex sets are represented as unions of convex sets (or CDDs in this arti-
cle).
Not every constraint needs to be present for all the operations. Actually,
a reduced version of the constraint systems can be used for most operations,
thus saving memory [13]. In practice, most tools use a variation of DBMs, called
Minimal Constraint Representation, which employs that idea. As these DBMs are
sparse, they are not stored like proper matrices, but as a linked list of constraints,
in order to save space. This minimal representation sets the context of this article.
That is, when we mention DBMs, we are always referring to this compact version.
Symbolic states are represented by a pair (l; z) where l is a location and z
a timed zone. Given a state1, the timed successor set is computed by the suc.
operator, dened as suc.(l; z) = f(l0; z0)= hl; a; ; ; l0i 2 E^z0 = suc (reset(z\
)) \ I(l0)g, where reset means putting the clocks in to zero and suc ( )
means replacing the constraints of the form x c by x < 1 while leaving the
rest intact. The details can be consulted in [14, Chapt. 2] or [15].
The basic (conceptual) procedure for checking reachability of property
requires a queue of symbolic states to be explored, commonly known as Pending ,
and a set of already visited states, known as Visited . The algorithm works like
this: insert the initial state in the Pending queue and initialize an empty Visited
set. Then, while Pending is not empty, take a state from it and check whether
the property holds for it. If it does, nish with a \YES" result, otherwise, put it
in Visited , and compute its (timed) successors (one for each outgoing transition
1 From now on states are implicitly symbolic.
Page 5
from the location). The total number of states is nite [10], but there can be
repetitions. To ensure termination, before putting them in Pending , it should be
checked that they are not included in any other state from Visited .
3 Clock Dierence Diagrams (CDDs)
In this section, we describe our implementation of Clock Dierence Diagrams
as presented in [9]. Our denitions, though in essence do not dier from the
original ones, have been tailored so that concepts discussed in this paper are
easily related to the denition.
3.1 Data Structure Denition
Denition 3 (CDD). Given a set X of clock variables, X 0 = X [f0g, a CDD
is dened by a tuple k = [di (k); Ints(k); S(k)], where:
1. di (k) 2 (X X 0) [ fTRUE ;FALSEg is the clock dierence represented
by this node. These clock dierences are extended with values TRUE and
FALSE. Nodes that hold any of these latter values are called terminal nodes.
2. Ints(k) is a list of outgoing edges. Each edge is labeled with an integral (open
or closed) interval. Ints(k)n denotes the nth list's interval, and jInts(k)j
denotes the interval list's size.
3. S(k) is a list of successor nodes. S(k)n and jS(k)j are dened in a similar
way as previously. For each 1 n jS(k)j, S(k)n yields the node reached
by traversing the edge Ints(k)n.
A CDD k such that di (k) is TRUE (or FALSE ) will usually be referred as
just TRUE (or FALSE ) for short. Similarly, given an interval I we will write
S(k; I) to denote the node (hence, the CDD) obtained by traversing the edge
labeled I from k. Formally, S(k; I) = k0 , 9i; 1 i jInts(k)j such that
Ints(k)i = I ^ S(k)i = k0.
Since CDDs are hierarchical in nature, an order on clock dierences is needed,
which can be dened as an extension of an order on clock variables.
Denition 4 (Structure invariant). The following constraints are imposed
on every node k of a CDD:
{ Whenever di (k) is TRUE or FALSE, the lists Ints(k) and S(k) must be
empty.
{ In the other case, no element of S(k) may be FALSE.
{ Given a total order on clock dierences <, it must hold that for any k0 in
S(k), either di (k) < di (k0) or k0 is TRUE.
{ Since edges are labeled by intervals, the existence of both dierences xi xj
and xj xi for any pair of clocks xi; xj is redundant. Therefore, if di (k)
is of the form xj xi it must hold that xi < xj.
repetitions. To ensure termination, before putting them in Pending , it should be
checked that they are not included in any other state from Visited .
3 Clock Dierence Diagrams (CDDs)
In this section, we describe our implementation of Clock Dierence Diagrams
as presented in [9]. Our denitions, though in essence do not dier from the
original ones, have been tailored so that concepts discussed in this paper are
easily related to the denition.
3.1 Data Structure Denition
Denition 3 (CDD). Given a set X of clock variables, X 0 = X [f0g, a CDD
is dened by a tuple k = [di (k); Ints(k); S(k)], where:
1. di (k) 2 (X X 0) [ fTRUE ;FALSEg is the clock dierence represented
by this node. These clock dierences are extended with values TRUE and
FALSE. Nodes that hold any of these latter values are called terminal nodes.
2. Ints(k) is a list of outgoing edges. Each edge is labeled with an integral (open
or closed) interval. Ints(k)n denotes the nth list's interval, and jInts(k)j
denotes the interval list's size.
3. S(k) is a list of successor nodes. S(k)n and jS(k)j are dened in a similar
way as previously. For each 1 n jS(k)j, S(k)n yields the node reached
by traversing the edge Ints(k)n.
A CDD k such that di (k) is TRUE (or FALSE ) will usually be referred as
just TRUE (or FALSE ) for short. Similarly, given an interval I we will write
S(k; I) to denote the node (hence, the CDD) obtained by traversing the edge
labeled I from k. Formally, S(k; I) = k0 , 9i; 1 i jInts(k)j such that
Ints(k)i = I ^ S(k)i = k0.
Since CDDs are hierarchical in nature, an order on clock dierences is needed,
which can be dened as an extension of an order on clock variables.
Denition 4 (Structure invariant). The following constraints are imposed
on every node k of a CDD:
{ Whenever di (k) is TRUE or FALSE, the lists Ints(k) and S(k) must be
empty.
{ In the other case, no element of S(k) may be FALSE.
{ Given a total order on clock dierences <, it must hold that for any k0 in
S(k), either di (k) < di (k0) or k0 is TRUE.
{ Since edges are labeled by intervals, the existence of both dierences xi xj
and xj xi for any pair of clocks xi; xj is redundant. Therefore, if di (k)
is of the form xj xi it must hold that xi < xj.
Page 6
{ For any two nodes k1 and k2 of a CDD (potentially the same node), and for
any intervals i1 2 Ints(k1), i2 2 Ints(k2), if it is the case that S(k1; i1) is
structurally equal to S(k2; i2), then it must be that S(k1; i1) and S(k2; i2) are
exactly the same structure instance. That is, we require complete structure
aliasing between identical substructures.
{ Every pair of intervals in di (k) must be disjoint. Moreover, di (k) must
be sorted; we say that for any two intervals I and J , I < J , 8i; j such
that i 2 I and j 2 J , it holds that i < j.
The previous denitions provide the cornerstone for timed automata veri-
cation using these hierarchical structures. The following denitions elaborate
on the structure's semantics. For this purpose, a mapping between temporal
constraints in X and CDDs needs to be dened.
Denition 5 (Mapping of temporal constraints to CDDs, r; CDD).
8 1; 2 2 X ; x; y 2 X;x < y; c 2 IN, we dene the mapping r ; CDD as
follows:
r;CDD(True) = [TRUE ; ; ] (1)
r;CDD(False) = [FALSE ; ; ] (2)
r;CDD(c < x) = [x 0; (c;1); r;CDD(True)] (3)
r;CDD(x < c) = [x 0; [ 0; c); r;CDD(True)] (4)
r;CDD(x y < c) = [x y; ( 1; c); r;CDD(True)] (5)
r;CDD( 1 ^ 2) = Intersection(r;CDD( 1); r;CDD( 2)) (6)
Mapping denitions 3, 4 and 5 have also a corresponding one for with the
interval closed to the right.
Theorem 1. The mapping r;CDD previously dened is correct with respect
to satisability, that is, 8 2 X ; v 2 VX ; v j= , v j= r;CDD( ). [16]
3.2 Algorithms
As an integral part of this work, we developed the necessary algorithms to per-
form full forward reachability based model-checking over CDD structures. The
interested reader is referenced to [17] for further discussion of these topics.
In Fig. 2 we present the union algorithm. Remember from Fig. 1 that when
a new zone z0 is found to be new it should be added to the Visited set by means
of Add((l0; z0);Visited). When Visitedl2 is represented as a non-convex CDD, the
previous Add() is, conceptually, Visitedl = Union(Visitedl; z0). For clarity and
ease of reading we present a recursive version of the algorithm, although the
actual implementation is iterative, stack-based, modies its parameters instead
of returning a new CDD, and is tailored to consider that the second parameter
does not contain branching.
2 Visitedl is the part of Visited that corresponds to location l.
any intervals i1 2 Ints(k1), i2 2 Ints(k2), if it is the case that S(k1; i1) is
structurally equal to S(k2; i2), then it must be that S(k1; i1) and S(k2; i2) are
exactly the same structure instance. That is, we require complete structure
aliasing between identical substructures.
{ Every pair of intervals in di (k) must be disjoint. Moreover, di (k) must
be sorted; we say that for any two intervals I and J , I < J , 8i; j such
that i 2 I and j 2 J , it holds that i < j.
The previous denitions provide the cornerstone for timed automata veri-
cation using these hierarchical structures. The following denitions elaborate
on the structure's semantics. For this purpose, a mapping between temporal
constraints in X and CDDs needs to be dened.
Denition 5 (Mapping of temporal constraints to CDDs, r; CDD).
8 1; 2 2 X ; x; y 2 X;x < y; c 2 IN, we dene the mapping r ; CDD as
follows:
r;CDD(True) = [TRUE ; ; ] (1)
r;CDD(False) = [FALSE ; ; ] (2)
r;CDD(c < x) = [x 0; (c;1); r;CDD(True)] (3)
r;CDD(x < c) = [x 0; [ 0; c); r;CDD(True)] (4)
r;CDD(x y < c) = [x y; ( 1; c); r;CDD(True)] (5)
r;CDD( 1 ^ 2) = Intersection(r;CDD( 1); r;CDD( 2)) (6)
Mapping denitions 3, 4 and 5 have also a corresponding one for with the
interval closed to the right.
Theorem 1. The mapping r;CDD previously dened is correct with respect
to satisability, that is, 8 2 X ; v 2 VX ; v j= , v j= r;CDD( ). [16]
3.2 Algorithms
As an integral part of this work, we developed the necessary algorithms to per-
form full forward reachability based model-checking over CDD structures. The
interested reader is referenced to [17] for further discussion of these topics.
In Fig. 2 we present the union algorithm. Remember from Fig. 1 that when
a new zone z0 is found to be new it should be added to the Visited set by means
of Add((l0; z0);Visited). When Visitedl2 is represented as a non-convex CDD, the
previous Add() is, conceptually, Visitedl = Union(Visitedl; z0). For clarity and
ease of reading we present a recursive version of the algorithm, although the
actual implementation is iterative, stack-based, modies its parameters instead
of returning a new CDD, and is tailored to consider that the second parameter
does not contain branching.
2 Visitedl is the part of Visited that corresponds to location l.
Page 7
1: function Union (rA; rB : CDD)! r0 : CDD
2: if rA = FALSE return CDDB
3: else if rA = TRUE return TRUE
4: else if di (rA) < di (rB)
5: for all I 2 Ints(rA)
6: Add(r0; I;Union(S(rA; I); rB))
7: end for
8: else if di (rA) > di (rB)
9: . . . follows as previous case.
10: else
11: for all I 2 Ints(rA); J 2 Ints(rB)
12: if :Empty(I \ J)
13: Add(r0; I \ J;Union(S(rA; I); S(rB ; J)))
14: end if
15: end for
16: end if
17: return r0
18: end function
Fig. 2. CDD union recursive algorithm.
4 Compressing Repeated Subtrees
Although the main motivation for CDDs was to obtain an appropriate data
structure for non-convex sets, some issues require special attention. We nd the
requirement for maximum sharing problematic.
Indeed, the original approach to maximum sharing described in [9] was to
keep a CDD-node cache as a hash table and try to nd an existing node there
when a new one was required. The operation was reputed as taking constant time.
However, for a node to be a suitable replacement of another, their successors
have to match. Unlike BDDs, which have xed binary branching, each CDD
node can be branched in any number of (non-overlapping) intervals. The depth
is indeed bounded by the square of the number of clocks, as each node involves
two of them3. It can clearly be seen that it doesn't seem likely to implement the
cache in constant time, so we take the term constant as meaning negligible, and
attribute that to the small sizes of the case studies originally used.
Case studies have grown since the CDD article by Behrmann et al. Current
literature examples have at least an order of magnitude more states, and many
times the number of clocks (keep in mind that timed automata reachability is
exponential in the number of clocks).
To understand the impact, either as a gain or a loss, of compressing CDDs
by the use of aliasing, we faced a number of options.
3 Note that this strategy diers from the reduce operation in OBDDs [18]. Reduce is
called once at the end of operations to recover the structure invariant that states
the nonexistence of repeated subgraphs. It works bottom up in linear time to merge
isomorphic substructures. Again, the time is linear because the branching is bounded.
2: if rA = FALSE return CDDB
3: else if rA = TRUE return TRUE
4: else if di (rA) < di (rB)
5: for all I 2 Ints(rA)
6: Add(r0; I;Union(S(rA; I); rB))
7: end for
8: else if di (rA) > di (rB)
9: . . . follows as previous case.
10: else
11: for all I 2 Ints(rA); J 2 Ints(rB)
12: if :Empty(I \ J)
13: Add(r0; I \ J;Union(S(rA; I); S(rB ; J)))
14: end if
15: end for
16: end if
17: return r0
18: end function
Fig. 2. CDD union recursive algorithm.
4 Compressing Repeated Subtrees
Although the main motivation for CDDs was to obtain an appropriate data
structure for non-convex sets, some issues require special attention. We nd the
requirement for maximum sharing problematic.
Indeed, the original approach to maximum sharing described in [9] was to
keep a CDD-node cache as a hash table and try to nd an existing node there
when a new one was required. The operation was reputed as taking constant time.
However, for a node to be a suitable replacement of another, their successors
have to match. Unlike BDDs, which have xed binary branching, each CDD
node can be branched in any number of (non-overlapping) intervals. The depth
is indeed bounded by the square of the number of clocks, as each node involves
two of them3. It can clearly be seen that it doesn't seem likely to implement the
cache in constant time, so we take the term constant as meaning negligible, and
attribute that to the small sizes of the case studies originally used.
Case studies have grown since the CDD article by Behrmann et al. Current
literature examples have at least an order of magnitude more states, and many
times the number of clocks (keep in mind that timed automata reachability is
exponential in the number of clocks).
To understand the impact, either as a gain or a loss, of compressing CDDs
by the use of aliasing, we faced a number of options.
3 Note that this strategy diers from the reduce operation in OBDDs [18]. Reduce is
called once at the end of operations to recover the structure invariant that states
the nonexistence of repeated subgraphs. It works bottom up in linear time to merge
isomorphic substructures. Again, the time is linear because the branching is bounded.
Page 8
Whatever aliases handling strategy is chosen, it must be implemented in the
Union() operation, which takes a zone found to be new and adds it to Visited
set, as seen in Fig. 2. Remember from Section 2 that the explorations nds
states (l; z), where z is a zone, that is, a convex CDD. Convex CDDs do not
contain branching. Then, (l; z) must be checked for inclusion in Visitedl, which
contains a non-convex CDD (i.e., a branching CDD). If it is indeed new, then
Union(Visitedl; z) must be performed. This is the only operation that has to
modify branched CDDs.
4.1 Handling Aliasing in the Union operation
The repetition detection could be done either on-the-
y or o-line. Doing it
on-the-
y is hard, because a generated subtree might be modied later on, as
the operation returns from the recursion4, so aliases detected might need to be
broken and whole subtrees copied.
The oine approach, which is easier to implement, mimics what is done in
BDDs: complete the operation, then make a second pass detecting aliasing and
thus compressing the CDD.
We started by the second one, and the results obtained showed no point
in also trying the rst approach: both approaches will obtain the same level of
compaction and, as will be seen in Section 5, memory overhead is still of a couple
of orders of magnitude compared to the standard version.
To achieve compression we attached a hash value to each CDD node, com-
puted as a hashing function over its elds combined with the hash values of
its descendants. A 64-bit multiplicative hash function was used. A refcount and
a boolean already compressed eld were also needed. As aliasing was only go-
ing to be detected at the end of the Union() operation, changes to implement
compression were somehow localized:
{ The Union() function maintains a hash table, mapping CDD hashes to their
pointers.
{ The compression is an in-order traversal looking each hash code in the table
and replacing the appropriate branch by another reference to the existing
one found in the table.
{ The CDD destructor also has to consider aliasing. I.e., consider the refcount
eld.
{ Aliasing is immaterial to the rest of the operations: the inclusion checking
can handle it transparently, and the others only operate on zones, which
are not aliased because they contain no branching. Conversely, non-convex
CDDs are only modied by Union() and the destructor.
The next section shows the experimentation performed. As can be seen, in
only little cases the use of CDDs was faster than DBMs. Also, the memory
usage increases up to many orders of magnitude. In Section 6 we discuss possible
threads to the validity of the method employed.
4 Our implementation is actually iterative, but the principle is the same.
Union() operation, which takes a zone found to be new and adds it to Visited
set, as seen in Fig. 2. Remember from Section 2 that the explorations nds
states (l; z), where z is a zone, that is, a convex CDD. Convex CDDs do not
contain branching. Then, (l; z) must be checked for inclusion in Visitedl, which
contains a non-convex CDD (i.e., a branching CDD). If it is indeed new, then
Union(Visitedl; z) must be performed. This is the only operation that has to
modify branched CDDs.
4.1 Handling Aliasing in the Union operation
The repetition detection could be done either on-the-
y or o-line. Doing it
on-the-
y is hard, because a generated subtree might be modied later on, as
the operation returns from the recursion4, so aliases detected might need to be
broken and whole subtrees copied.
The oine approach, which is easier to implement, mimics what is done in
BDDs: complete the operation, then make a second pass detecting aliasing and
thus compressing the CDD.
We started by the second one, and the results obtained showed no point
in also trying the rst approach: both approaches will obtain the same level of
compaction and, as will be seen in Section 5, memory overhead is still of a couple
of orders of magnitude compared to the standard version.
To achieve compression we attached a hash value to each CDD node, com-
puted as a hashing function over its elds combined with the hash values of
its descendants. A 64-bit multiplicative hash function was used. A refcount and
a boolean already compressed eld were also needed. As aliasing was only go-
ing to be detected at the end of the Union() operation, changes to implement
compression were somehow localized:
{ The Union() function maintains a hash table, mapping CDD hashes to their
pointers.
{ The compression is an in-order traversal looking each hash code in the table
and replacing the appropriate branch by another reference to the existing
one found in the table.
{ The CDD destructor also has to consider aliasing. I.e., consider the refcount
eld.
{ Aliasing is immaterial to the rest of the operations: the inclusion checking
can handle it transparently, and the others only operate on zones, which
are not aliased because they contain no branching. Conversely, non-convex
CDDs are only modied by Union() and the destructor.
The next section shows the experimentation performed. As can be seen, in
only little cases the use of CDDs was faster than DBMs. Also, the memory
usage increases up to many orders of magnitude. In Section 6 we discuss possible
threads to the validity of the method employed.
4 Our implementation is actually iterative, but the principle is the same.
Page 9
5 Experimentation
To test the data structures, we incorporated CDDs into the model checker
Zeus5 [12] and ran a series of experiments against well known case studies from
the literature.
1. RCS4 and RCS5 , the Railroad Crossing System inspired in [19] with 4 and
5 trains. The models have 8 and 9 automata, with one clock each.
2. Pipe6 , end-to-end signal propagation in a pipe-line of sporadic processes
that forward a signal emitted by a quasi periodic source, with 6 stages. 14
components, one clock each.
3. FDDI4 and FDDI8 , an extension of the FDDI token ring protocol where
the observer monitors the time the token takes to return to a given station.
The models use 9 automata with 14 clocks and 17 automata with 26 clocks,
respectively.
4. Conveyor6AB , Conveyor Belt [12] (with 6 stages and 2 objects). 11 compo-
nents totalizing 10 clocks.
5. Struct , Active Structural Control System that limits structural vibration due
to earthquakes or strong winds ([20]). 7 automata, one clock each.
Some case studies were also treated by ObsSlice [21], a safe model reducer,
and are primed on the tables. Runs are tagged with true or false depending on
whether the error state was reachable or not. All the experiments were run on an
Intel Xeon 1.6 GHz machine with 4 GB of RAM, running FreeBSD 7.0 Unix in
64-bit mode. Besides the main runs, some extra ones were done to substantiate
the claims in Section 6, which we omit for space reasons.
Table 1 reports the saving on the number of states found when using CDDs.
These savings are explained by zones that would be explored again when Visited
is a set of zones, because although their state space is already explored, it is
fragmentary covered by many other zones and thus not detected. Unluckily,
reduction in number of states does not have a correlation in neither time nor
memory.
Table 2 shows time and memory results for the three implementations con-
sidered. DBM is the standard version that uses sets of zones for the Visited
set and expresses DBMs as a packed bit structure of the minimum constraint
systems ([14, Chapt. 5]). The version that uses (non-convex) CDDs is labeled
CDD, and CDD+A when aliases detection and compression was used. When
the experiment ran out of memory (OOM) time was measured up to memory
exhaustion. All percentages are relative to the DBM version.
As can be seen in the tables, most of the cases run out of memory with
CDDs, and few of them are saved when compressing by aliasing. Compression,
however, does a good job, diminishing signicantly both memory and time, but
the results are still orders of magnitude higher than using sets of zones.
5 Although Zeus is a distributed tool, a monoprocessor version was used for this
article.
To test the data structures, we incorporated CDDs into the model checker
Zeus5 [12] and ran a series of experiments against well known case studies from
the literature.
1. RCS4 and RCS5 , the Railroad Crossing System inspired in [19] with 4 and
5 trains. The models have 8 and 9 automata, with one clock each.
2. Pipe6 , end-to-end signal propagation in a pipe-line of sporadic processes
that forward a signal emitted by a quasi periodic source, with 6 stages. 14
components, one clock each.
3. FDDI4 and FDDI8 , an extension of the FDDI token ring protocol where
the observer monitors the time the token takes to return to a given station.
The models use 9 automata with 14 clocks and 17 automata with 26 clocks,
respectively.
4. Conveyor6AB , Conveyor Belt [12] (with 6 stages and 2 objects). 11 compo-
nents totalizing 10 clocks.
5. Struct , Active Structural Control System that limits structural vibration due
to earthquakes or strong winds ([20]). 7 automata, one clock each.
Some case studies were also treated by ObsSlice [21], a safe model reducer,
and are primed on the tables. Runs are tagged with true or false depending on
whether the error state was reachable or not. All the experiments were run on an
Intel Xeon 1.6 GHz machine with 4 GB of RAM, running FreeBSD 7.0 Unix in
64-bit mode. Besides the main runs, some extra ones were done to substantiate
the claims in Section 6, which we omit for space reasons.
Table 1 reports the saving on the number of states found when using CDDs.
These savings are explained by zones that would be explored again when Visited
is a set of zones, because although their state space is already explored, it is
fragmentary covered by many other zones and thus not detected. Unluckily,
reduction in number of states does not have a correlation in neither time nor
memory.
Table 2 shows time and memory results for the three implementations con-
sidered. DBM is the standard version that uses sets of zones for the Visited
set and expresses DBMs as a packed bit structure of the minimum constraint
systems ([14, Chapt. 5]). The version that uses (non-convex) CDDs is labeled
CDD, and CDD+A when aliases detection and compression was used. When
the experiment ran out of memory (OOM) time was measured up to memory
exhaustion. All percentages are relative to the DBM version.
As can be seen in the tables, most of the cases run out of memory with
CDDs, and few of them are saved when compressing by aliasing. Compression,
however, does a good job, diminishing signicantly both memory and time, but
the results are still orders of magnitude higher than using sets of zones.
5 Although Zeus is a distributed tool, a monoprocessor version was used for this
article.
Page 10
Example DBM CDD Dierence
States found (Ratio)
FDDI8 true 9272 9272 0%
FDDI8 false 9272 9272 0%
FDDI9 true 27186 27186 0%
FDDI9 false 27186 27186 0%
RCS4 true 3337 2963 -11.2 %
RCS4 false 8274 OOM
RCS5 true 90560 OOM
RCS5 false 281547 OOM
Pipe6 ' true 52668 OOM
Pipe6 ' false 24581 OOM
Pipe6 true 727694 OOM
Pipe6 false 80280 OOM
Struct' true 677 638 -5.7%
Struct' false 6205 5495 -11.4%
Conveyor6AB true 2431 2431 0%
Conveyor6AB false 15481 OOM
Table 1. States found with and without CDDs.
It is clear from the results that the classical use of sets of zones as a repository
outperforms the CDD implementations when it comes to memory use, even when
performing full aliasing detection. This at rst may seem unintuitive, since the
idea behind tree-like structures is the sharing of common substructure and also
leveraging on branch prex sharing. However, some factors in
uence the CDD
repository structure into growing bigger than expected. In the rst place, the set
based implementation relies strongly on the reduced representation of zones, that
is, not all clock dierences need to be explicitly represented in order to accurately
render a given zone. In the case of tree-like structures, although each added zone
is indeed represented in a reduced way, the nal product may not preserve this
reduction, as many dierent zones may have various clock dierences explicitly
represented. However, once they are joined, the structure is prone to converge
to represent all clock dierences.
Another factor that in
uences structure growth is interval atomization, that
is, the phenomenon of constantly breaking up intervals into smaller components.
This results from the addition of zones that overlap in a non-constant way over
the already existing repository. For instance, refer to Fig. 3, where the spike-
shaped area depicts the projection over clocks x and y of the temporal state
space at a given point during the verication, while the rectangular area rep-
resents the projection of a new zone to be added to the repository. In a CDD
representation, the original spike-shaped repository would span three dierent
intervals for the (y y0) clock dierence. However, when adding the rectangle-
shaped zone, these three intervals are further divided into seven. Whereas in the
classic set-based implementation the zone addition is achieved just by adding
the zone representation to the set, in the tree-like representation the zone may
States found (Ratio)
FDDI8 true 9272 9272 0%
FDDI8 false 9272 9272 0%
FDDI9 true 27186 27186 0%
FDDI9 false 27186 27186 0%
RCS4 true 3337 2963 -11.2 %
RCS4 false 8274 OOM
RCS5 true 90560 OOM
RCS5 false 281547 OOM
Pipe6 ' true 52668 OOM
Pipe6 ' false 24581 OOM
Pipe6 true 727694 OOM
Pipe6 false 80280 OOM
Struct' true 677 638 -5.7%
Struct' false 6205 5495 -11.4%
Conveyor6AB true 2431 2431 0%
Conveyor6AB false 15481 OOM
Table 1. States found with and without CDDs.
It is clear from the results that the classical use of sets of zones as a repository
outperforms the CDD implementations when it comes to memory use, even when
performing full aliasing detection. This at rst may seem unintuitive, since the
idea behind tree-like structures is the sharing of common substructure and also
leveraging on branch prex sharing. However, some factors in
uence the CDD
repository structure into growing bigger than expected. In the rst place, the set
based implementation relies strongly on the reduced representation of zones, that
is, not all clock dierences need to be explicitly represented in order to accurately
render a given zone. In the case of tree-like structures, although each added zone
is indeed represented in a reduced way, the nal product may not preserve this
reduction, as many dierent zones may have various clock dierences explicitly
represented. However, once they are joined, the structure is prone to converge
to represent all clock dierences.
Another factor that in
uences structure growth is interval atomization, that
is, the phenomenon of constantly breaking up intervals into smaller components.
This results from the addition of zones that overlap in a non-constant way over
the already existing repository. For instance, refer to Fig. 3, where the spike-
shaped area depicts the projection over clocks x and y of the temporal state
space at a given point during the verication, while the rectangular area rep-
resents the projection of a new zone to be added to the repository. In a CDD
representation, the original spike-shaped repository would span three dierent
intervals for the (y y0) clock dierence. However, when adding the rectangle-
shaped zone, these three intervals are further divided into seven. Whereas in the
classic set-based implementation the zone addition is achieved just by adding
the zone representation to the set, in the tree-like representation the zone may
Page 11
Example DBM CDD CDD+A DBM CDD CDD+A
time (elapsed secs) mem. (KB)
FDDI8 true 520.23 114.11 121.03 24180 107432 81248
-78.0% -76.7% +344% +235%
FDDI8 false 520.13 113.81 120.83 24240 107464 82084
-78.1% -76.7% +343% + 2.38%
FDDI9 true 2516.22 632.31 680.72 110328 1245684 780380
-74.8% -72.9% +1029% +607%
FDDI9 false 2515.50 634.45 682.86 110388 1245716 783652
-74.7% -72.8% +1028% +610%
RCS4 true 1.93 2378.00 647.97 2204 1572572 221084
+123112% +33573% +71250% +9931%
RCS4 false 7.63 6513.07+ 6183.44+ 3376 OOM OOM
1 1
RCS5 true 47.29 22931.11 14430.81 8428 OOM OOM
1 1
RCS5 false 1597.10 22979.94 14220.28 41336 OOM OOM
1 1
Conveyor6AB true 2.04 102.93 85.23 3808 398724 123128
+5044% +4077% +10370% +3233%
Conveyor6AB false 15.56 2189.43 5553.37 11068 OOM OOM
1 1
Pipe6 ' true 118.69 4101.81 6525.58 21400 OOM OOM
1 1
Pipe6 ' false 52.99 3522.60 8397.18 17728 OOM OOM
1 1
Pipe6 true 3737.10 2780.36 4663.14 191260 OOM OOM
1 1
Pipe6 false 228.64 2978.07 5646.33 48136 OOM OOM
1 1
Struct' true 15.56 5.90 5.19 10884 31216 10880
-62.0% 66.5% 186.8% -0.0%
Struct' false 3.46 4435.72+ 15618.64 2564 OOM 3557300
1 +45130% 1 +138740%
Table 2. Results obtained with all implementations.
time (elapsed secs) mem. (KB)
FDDI8 true 520.23 114.11 121.03 24180 107432 81248
-78.0% -76.7% +344% +235%
FDDI8 false 520.13 113.81 120.83 24240 107464 82084
-78.1% -76.7% +343% + 2.38%
FDDI9 true 2516.22 632.31 680.72 110328 1245684 780380
-74.8% -72.9% +1029% +607%
FDDI9 false 2515.50 634.45 682.86 110388 1245716 783652
-74.7% -72.8% +1028% +610%
RCS4 true 1.93 2378.00 647.97 2204 1572572 221084
+123112% +33573% +71250% +9931%
RCS4 false 7.63 6513.07+ 6183.44+ 3376 OOM OOM
1 1
RCS5 true 47.29 22931.11 14430.81 8428 OOM OOM
1 1
RCS5 false 1597.10 22979.94 14220.28 41336 OOM OOM
1 1
Conveyor6AB true 2.04 102.93 85.23 3808 398724 123128
+5044% +4077% +10370% +3233%
Conveyor6AB false 15.56 2189.43 5553.37 11068 OOM OOM
1 1
Pipe6 ' true 118.69 4101.81 6525.58 21400 OOM OOM
1 1
Pipe6 ' false 52.99 3522.60 8397.18 17728 OOM OOM
1 1
Pipe6 true 3737.10 2780.36 4663.14 191260 OOM OOM
1 1
Pipe6 false 228.64 2978.07 5646.33 48136 OOM OOM
1 1
Struct' true 15.56 5.90 5.19 10884 31216 10880
-62.0% 66.5% 186.8% -0.0%
Struct' false 3.46 4435.72+ 15618.64 2564 OOM 3557300
1 +45130% 1 +138740%
Table 2. Results obtained with all implementations.
Page 12
be copied seven times (one for each interval), up to a depth dependent on the
actual size of the clock set. Although at rst these copies would surely be aliased,
further additions may dislodge these aliases from one another, eectively increas-
ing structure size. It is worth mentioning that the depth to which the zones are
replicated is sensitive to clock order in the CDD. We speculate atomization-
generating clock dierences should be pushed further down in the hierarchy as
a possible deterrent to structure size explosion.
Fig. 3. An addition of a zone to a state space repository generating multiple intervals
(projection over two clocks).
Indeed, it is well known in BDDs that the ordering of variables can have
a important impact in the size of the structure. The same can happen with
CDDs. However, nding a good ordering on clocks is a topic on its own, as in
BDDs. We only experimented with a xed ordering, probably suboptimal. We
also performed test runs based on random ordering of the specication clocks.
In these preliminary experiments, clock ordering did show a dramatic eect
on structure size (drastically reducing or increasing it), although still not being
competitive enough compared to the DBM results. There is still much research to
perform to be able to tell how a particular clock ordering in
uences the structure,
or whether this eect can be predicted by analyzing the raw system model.
The clock ordering issue for CDDs was already mentioned in [22], although the
article draws conclusions on the behaviour of CDDs based on a tool that uses
CRDs, which encode only one bound. We leave to future work researching the
importance of dierent clock orderings and nding out if there is some generic
way of determining an ordering that alleviates the huge overhead that using
CDDs seem to have.
The results we obtained contradict [9] and previous work on CRDs [11]. One
possible explanation, besides size of case studies, relies on clock order. BDD-
like structures are very sensitive to the ordering of clock variables. It could
be the case that the mentioned articles coded input automata in a way that
produced a favorable clock ordering. We were not able to test directly against
UPPAAL, as current versions do not include CDDs as a choice of data structure.
RED, the tool based on CRDs, does not perform well on our case studies. For
example, on RCS4 true, it takes 5 seconds and 10 MB of memory (more than
our DBM implementation but less than CDDs), but also reports a (spureous)
counterexample for the unreachable version. The other case studies behaved
actual size of the clock set. Although at rst these copies would surely be aliased,
further additions may dislodge these aliases from one another, eectively increas-
ing structure size. It is worth mentioning that the depth to which the zones are
replicated is sensitive to clock order in the CDD. We speculate atomization-
generating clock dierences should be pushed further down in the hierarchy as
a possible deterrent to structure size explosion.
Fig. 3. An addition of a zone to a state space repository generating multiple intervals
(projection over two clocks).
Indeed, it is well known in BDDs that the ordering of variables can have
a important impact in the size of the structure. The same can happen with
CDDs. However, nding a good ordering on clocks is a topic on its own, as in
BDDs. We only experimented with a xed ordering, probably suboptimal. We
also performed test runs based on random ordering of the specication clocks.
In these preliminary experiments, clock ordering did show a dramatic eect
on structure size (drastically reducing or increasing it), although still not being
competitive enough compared to the DBM results. There is still much research to
perform to be able to tell how a particular clock ordering in
uences the structure,
or whether this eect can be predicted by analyzing the raw system model.
The clock ordering issue for CDDs was already mentioned in [22], although the
article draws conclusions on the behaviour of CDDs based on a tool that uses
CRDs, which encode only one bound. We leave to future work researching the
importance of dierent clock orderings and nding out if there is some generic
way of determining an ordering that alleviates the huge overhead that using
CDDs seem to have.
The results we obtained contradict [9] and previous work on CRDs [11]. One
possible explanation, besides size of case studies, relies on clock order. BDD-
like structures are very sensitive to the ordering of clock variables. It could
be the case that the mentioned articles coded input automata in a way that
produced a favorable clock ordering. We were not able to test directly against
UPPAAL, as current versions do not include CDDs as a choice of data structure.
RED, the tool based on CRDs, does not perform well on our case studies. For
example, on RCS4 true, it takes 5 seconds and 10 MB of memory (more than
our DBM implementation but less than CDDs), but also reports a (spureous)
counterexample for the unreachable version. The other case studies behaved
Page 13
similarly. Such discrepancies could not be solved while corresponding with the
author, whitin the timeframe available for this article. We expect to resolve them
for the nal version of this article.
6 Threats to Validity
The technique we used is not claimed to be denitive; it is just a research pro-
totype. Nevertheless it is important to understand that deciding aliasing based
only on hash values can lead to regarding dierent subtrees as the same. In turn,
as non-convex CDDs are used to check if a newly found zone is indeed yet un-
explored, it can be the case that it is explored again without need, because a
branch was incorrectly chopped o from the CDD. On a pathological bad case
this could jeopardize termination. Being aware of the potential problem, we still
consider our results valid because:
{ In terms of time used, the technique provides a lower bound, compared to
what should be used in a \production" setting, which will require comparing
complete subtrees or any other more involved method.
{ Calculating hashes and storing them incurs by itself in an overhead, which
we measured to be around 40% in time and around 10% in memory. That
means that resorting to the full comparison method would consume at most
10% less memory, because there might be no need to store the hashes.
{ However, if hashes are not used, and get replaced by an in-depth comparison
of subtrees, time requirements scale dramatically: all our case studies would
run out of time.
{ If the model checker does terminate, the number of explored states could be
higher than the version without compression, but equally correct.
{ Although the model checker ran out of memory in some cases, it did so by the
very size of the case study. The states-found counter showed normal values
in all cases.
{ When it did terminate, the number of states found, and when appropriate,
the trace, coincided with the version without compression.
{ There is an easy way around to get exact results: use hashes to determine
if two subtrees are possible matches, and if the hashes coincide, only then
perform an in-depth comparison. Note that this approach does not save
memory, and the runtime overhead can be between 7% and 20% on the
cases where we tried it.
{ Nevertheless, we still did all of our experimentation only using hashes, be-
cause resorting to in-depth comparisons would only increase time and pos-
sibly memory counters, not changing the overall results.
{ The data structure used to represent branching CDDs does have provisions
to sacrice a little extra space in order to save time (while adding an extra
branching, for instance). As these options are tunable, we turned them o to
rule out that the memory overhead of CDDs was not due to them. Results
showed that the memory footprint diminished between 10 and 40% with a
author, whitin the timeframe available for this article. We expect to resolve them
for the nal version of this article.
6 Threats to Validity
The technique we used is not claimed to be denitive; it is just a research pro-
totype. Nevertheless it is important to understand that deciding aliasing based
only on hash values can lead to regarding dierent subtrees as the same. In turn,
as non-convex CDDs are used to check if a newly found zone is indeed yet un-
explored, it can be the case that it is explored again without need, because a
branch was incorrectly chopped o from the CDD. On a pathological bad case
this could jeopardize termination. Being aware of the potential problem, we still
consider our results valid because:
{ In terms of time used, the technique provides a lower bound, compared to
what should be used in a \production" setting, which will require comparing
complete subtrees or any other more involved method.
{ Calculating hashes and storing them incurs by itself in an overhead, which
we measured to be around 40% in time and around 10% in memory. That
means that resorting to the full comparison method would consume at most
10% less memory, because there might be no need to store the hashes.
{ However, if hashes are not used, and get replaced by an in-depth comparison
of subtrees, time requirements scale dramatically: all our case studies would
run out of time.
{ If the model checker does terminate, the number of explored states could be
higher than the version without compression, but equally correct.
{ Although the model checker ran out of memory in some cases, it did so by the
very size of the case study. The states-found counter showed normal values
in all cases.
{ When it did terminate, the number of states found, and when appropriate,
the trace, coincided with the version without compression.
{ There is an easy way around to get exact results: use hashes to determine
if two subtrees are possible matches, and if the hashes coincide, only then
perform an in-depth comparison. Note that this approach does not save
memory, and the runtime overhead can be between 7% and 20% on the
cases where we tried it.
{ Nevertheless, we still did all of our experimentation only using hashes, be-
cause resorting to in-depth comparisons would only increase time and pos-
sibly memory counters, not changing the overall results.
{ The data structure used to represent branching CDDs does have provisions
to sacrice a little extra space in order to save time (while adding an extra
branching, for instance). As these options are tunable, we turned them o to
rule out that the memory overhead of CDDs was not due to them. Results
showed that the memory footprint diminished between 10 and 40% with a
Page 14
proportional increase in time. The extra memory requirements for CDDs
being many orders of magnitude higher, the potential saving does not make
the dierence.
Clock ordering deserves separate mention, as an appropriate one could have
high impact in the results. Our preliminary results are non yet conclusive in this
regards, as mentioned in page 5. We leave this area pending to future research.
7 Conclusions and Future Work
In this article we revisited Clock Dierence Diagrams, a data structure that
was born to represent non-convex sets of clock valuations for model checking of
timed automata. The original presentation left some questions open, namely ex-
perimentation with bigger case studies, and clarication of the repeated subtree
detection algorithm.
To approach these questions, we evaluated memory and time performance
of three implementations for representing non-convex state sets: sets of DBMs
(which is the standard implementation), (non-convex) CDDs without aliasing
(i.e., with repeated subtrees) and (non-convex) CDDs with aliasing (i.e., without
repeated subtrees).
Regarding CDDs, as shown in Section 5, if repeated subtrees are not handled
by means of aliasing, time and memory requirements can grow by many orders
of magnitude, even though the number of states diminishes as expected in some
cases.
Our experiments conrm that although aliasing can decrease memory con-
sumption signicantly, it is not enough to make the CDD version competitive.
Even with aliasing, CDDs need orders of magnitude more memory. The original
1999 article that introduced CDDs, by Behrmann et al., showed memory savings
at the price of augmented times. Their case studies, although representative at
the time, are orders of magnitude smaller than the ones that tools handle nowa-
days. We attribute to the number of clocks used and the hundreds of thousands
of states dealt with today the dierence in outcome, as both factors contribute
to more branching in CDDs. A favorable order of clock variables could also be
responsible for the dierent outcome.
Said article did not provide details on how aliasing was handled. We found
that detecting it is costlier and less useful than expected. Our experiments still
place the non-convex, aliased CDD implementation as taking much more time
than the standard one based on sets of convexes.
Of course we cannot prove that there is not a better implementation that
correctly handles aliasing and is faster than the standard one. But after getting
involved with the details of the algorithms we haven't found convincing, detailed
evidence that it does exist. Even if it did exist, our current implementation does
achieve maximum compression, and on current days case studies, the memory
overhead is just too high. It is worth emphasizing that our implementation is
based on the same packet bit structure described in [14, Chapt. 5], which is very
ecient in space.
being many orders of magnitude higher, the potential saving does not make
the dierence.
Clock ordering deserves separate mention, as an appropriate one could have
high impact in the results. Our preliminary results are non yet conclusive in this
regards, as mentioned in page 5. We leave this area pending to future research.
7 Conclusions and Future Work
In this article we revisited Clock Dierence Diagrams, a data structure that
was born to represent non-convex sets of clock valuations for model checking of
timed automata. The original presentation left some questions open, namely ex-
perimentation with bigger case studies, and clarication of the repeated subtree
detection algorithm.
To approach these questions, we evaluated memory and time performance
of three implementations for representing non-convex state sets: sets of DBMs
(which is the standard implementation), (non-convex) CDDs without aliasing
(i.e., with repeated subtrees) and (non-convex) CDDs with aliasing (i.e., without
repeated subtrees).
Regarding CDDs, as shown in Section 5, if repeated subtrees are not handled
by means of aliasing, time and memory requirements can grow by many orders
of magnitude, even though the number of states diminishes as expected in some
cases.
Our experiments conrm that although aliasing can decrease memory con-
sumption signicantly, it is not enough to make the CDD version competitive.
Even with aliasing, CDDs need orders of magnitude more memory. The original
1999 article that introduced CDDs, by Behrmann et al., showed memory savings
at the price of augmented times. Their case studies, although representative at
the time, are orders of magnitude smaller than the ones that tools handle nowa-
days. We attribute to the number of clocks used and the hundreds of thousands
of states dealt with today the dierence in outcome, as both factors contribute
to more branching in CDDs. A favorable order of clock variables could also be
responsible for the dierent outcome.
Said article did not provide details on how aliasing was handled. We found
that detecting it is costlier and less useful than expected. Our experiments still
place the non-convex, aliased CDD implementation as taking much more time
than the standard one based on sets of convexes.
Of course we cannot prove that there is not a better implementation that
correctly handles aliasing and is faster than the standard one. But after getting
involved with the details of the algorithms we haven't found convincing, detailed
evidence that it does exist. Even if it did exist, our current implementation does
achieve maximum compression, and on current days case studies, the memory
overhead is just too high. It is worth emphasizing that our implementation is
based on the same packet bit structure described in [14, Chapt. 5], which is very
ecient in space.
Page 15
As already mentioned, clock ordering also deserves analysis. It is well known
that BDD sizes are highly sensitive to the order of variables. The same can be
expected from CDDs.
The topic of how to order clocks to diminish branching in CDDs is, in our
opinion, the path that could turn CDDs useful in the future, if any. For that
to happen, orderings that diminish branching by orders of magnitude need to
be found, to counter for the orders of magnitude more memory that branching
seems to consume.
References
1. Alur, R., Dill, D.L.: A theory of timed automata. Theoretical Computer Science
126(2) (1994) 183{235
2. Bouajjani, A., Tripakis, S., Yovine, S.: On-the-Fly Symbolic Model-Checking for
Real-Time Systems. In: 18th IEEE Real-Time Systems Symposium (RTSS `97),
San Francisco, USA, IEEE Computer Scienty Press (1997)
3. Bozga, M., Daws, C., Maler, O., Olivero, A., Tripakis, S., Yovine, S.: Kronos: A
model-checking tool for real-time systems. In: Proc. of the 10th Intl. Conf. CAV '98.
Volume 1427 of LNCS., Springer-Verlag (1998) 546{550
4. Bengtsson, J., Larsen, K.G., Larsson, F., Pettersson, P., Yi, W.: UPPAAL - a
tool suite for automatic verication of real-time systems. In: Hybrid Systems,
Springer-Verlag (1995) 232{243
5. Dill, D.L.: Timing assumptions and verication of nite-state concurrent systems.
In: International Workshop of Automatic Verication Methods for Finite State
Systems. Volume 407 of LNCS., Grenoble, France, Springer-Verlag (1990) 197{212
6. Bryant, R.E.: Graph-based algorithms for boolean function manipulation. IEEE
Transactions on Computers 35(8) (1986) 677{691
7. E. Asarin, M. Bozga, A. Kerbrat, O. Maler, M. Pnueli, A. Rasse: Data structures
for the verication of timed automata. In O. Maler, ed.: Hybrid and Real-Time
Systems. Volume 1201 of LNCS., Grenoble, France, Springer Verlag (1997) 346{360
8. K. Strehl, L. Thiele: Symbolic model checking using Interval Diagram techniques.
Technical report, Computer Engineering and Networks Lab (TIK), Swiss Federal
Institute of Technology (ETH) (1998)
9. Behrmann, G., Larsen, K.G., Pearson, J., Weise, C., Yi, W.: Ecient timed reach-
ability analysis using Clock Dierence Diagrams. In: Computer Aided Verication.
(1999) 341{353
10. Alur, R., Courcoubetis, C., Dill, D.L.: Model-checking in dense real-time. Infor-
mation and Computation 104(1) (1993) 2{34
11. Wang, F.: Ecient verication of timed automata with BDD-like data-structures.
In: VMCAI 2003: Proceedings of the 4th International Conference on Verication,
Model Checking, and Abstract Interpretation, London, UK, Springer-Verlag (2003)
189{205
12. Braberman, V., Olivero, A., Schapachnik, F.: Issues in Distributed Model-Checking
of Timed Automata: building zeus. International Journal of Software Tools for
Technology Transfer 7 (2005) 4{18
13. Larsen, K.G., Larsson, F., Pettersson, P., Yi, W.: Compact data structures and
state-space reduction for model-checking real-time systems. Real-Time Syst. 25(2-
3) (2003) 255{275
that BDD sizes are highly sensitive to the order of variables. The same can be
expected from CDDs.
The topic of how to order clocks to diminish branching in CDDs is, in our
opinion, the path that could turn CDDs useful in the future, if any. For that
to happen, orderings that diminish branching by orders of magnitude need to
be found, to counter for the orders of magnitude more memory that branching
seems to consume.
References
1. Alur, R., Dill, D.L.: A theory of timed automata. Theoretical Computer Science
126(2) (1994) 183{235
2. Bouajjani, A., Tripakis, S., Yovine, S.: On-the-Fly Symbolic Model-Checking for
Real-Time Systems. In: 18th IEEE Real-Time Systems Symposium (RTSS `97),
San Francisco, USA, IEEE Computer Scienty Press (1997)
3. Bozga, M., Daws, C., Maler, O., Olivero, A., Tripakis, S., Yovine, S.: Kronos: A
model-checking tool for real-time systems. In: Proc. of the 10th Intl. Conf. CAV '98.
Volume 1427 of LNCS., Springer-Verlag (1998) 546{550
4. Bengtsson, J., Larsen, K.G., Larsson, F., Pettersson, P., Yi, W.: UPPAAL - a
tool suite for automatic verication of real-time systems. In: Hybrid Systems,
Springer-Verlag (1995) 232{243
5. Dill, D.L.: Timing assumptions and verication of nite-state concurrent systems.
In: International Workshop of Automatic Verication Methods for Finite State
Systems. Volume 407 of LNCS., Grenoble, France, Springer-Verlag (1990) 197{212
6. Bryant, R.E.: Graph-based algorithms for boolean function manipulation. IEEE
Transactions on Computers 35(8) (1986) 677{691
7. E. Asarin, M. Bozga, A. Kerbrat, O. Maler, M. Pnueli, A. Rasse: Data structures
for the verication of timed automata. In O. Maler, ed.: Hybrid and Real-Time
Systems. Volume 1201 of LNCS., Grenoble, France, Springer Verlag (1997) 346{360
8. K. Strehl, L. Thiele: Symbolic model checking using Interval Diagram techniques.
Technical report, Computer Engineering and Networks Lab (TIK), Swiss Federal
Institute of Technology (ETH) (1998)
9. Behrmann, G., Larsen, K.G., Pearson, J., Weise, C., Yi, W.: Ecient timed reach-
ability analysis using Clock Dierence Diagrams. In: Computer Aided Verication.
(1999) 341{353
10. Alur, R., Courcoubetis, C., Dill, D.L.: Model-checking in dense real-time. Infor-
mation and Computation 104(1) (1993) 2{34
11. Wang, F.: Ecient verication of timed automata with BDD-like data-structures.
In: VMCAI 2003: Proceedings of the 4th International Conference on Verication,
Model Checking, and Abstract Interpretation, London, UK, Springer-Verlag (2003)
189{205
12. Braberman, V., Olivero, A., Schapachnik, F.: Issues in Distributed Model-Checking
of Timed Automata: building zeus. International Journal of Software Tools for
Technology Transfer 7 (2005) 4{18
13. Larsen, K.G., Larsson, F., Pettersson, P., Yi, W.: Compact data structures and
state-space reduction for model-checking real-time systems. Real-Time Syst. 25(2-
3) (2003) 255{275
Page 16
14. Schapachnik, F.: Timed Automata Model Checking in Monoprocessor and Multi-
processor Architectures. Phd thesis, Departamento de Computacion, Facultad de
Ciencias Exactas y Naturales, Universidad de Buenos Aires (2007)
15. Yovine, S.: Model checking timed automata. In Rozenberg, G., Vaandrager, F.,
eds.: Embedded Systems. Volume 1494 of LNCS., Springer-Verlag (1997)
16. Pavese, E.: A new data structure based on BDDs for the model check-
ing of timed systems. Degree thesis, Departamento de Computacion, Fac-
ultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires (2006)
(http://lafhis.dc.uba.ar/~epavese/tesis_pavese.ps.gz).
17. Pavese, E., Schapachnik, F.: Relaxed Clock Dierence Diagrams for Timed Au-
tomata Model Checking. Technical report, Departamento de Computacion, Facul-
tad de Ciencias Exactas y Naturales, Universidad de Buenos Aires (2007)
18. Clarke, E., Grumberg, O., Peled, D.A.: Model Checking. The MIT Press. Cam-
bridge, Massachusetts (1999)
19. Alur, R., Courcoubetis, C., Dill, D., Halbwachs, N., Wong-Toi, H.: An implemen-
tation of three algorithms for timing verication based on automata emptiness. In:
Proceedings of the 13th IEEE Real-time Systems Symposium, Phoenix, Arizona
(1992) 157{166
20. Elseaidy, W., Cleaveland, R., Jr., J.B.: Modeling and verifying active structural
control systems. Science of Computer Programming 29(1-2) (1997) 99{122
21. Braberman, V., Garbervetsky, D., Olivero, A.: ObsSlice: A timed automata slicer
based on observers. In: Proc. of the 16th Intl. Conf. CAV '04. LNCS, Springer
Verlag (2004)
22. Beyer, D., Noack, A.: Can decision diagrams overcome state space explosion in
real-time verication? In Konig, H., Heiner, M., Wolisz, A., eds.: FORTE. Volume
2767 of Lecture Notes in Computer Science., Springer (2003) 193{208
processor Architectures. Phd thesis, Departamento de Computacion, Facultad de
Ciencias Exactas y Naturales, Universidad de Buenos Aires (2007)
15. Yovine, S.: Model checking timed automata. In Rozenberg, G., Vaandrager, F.,
eds.: Embedded Systems. Volume 1494 of LNCS., Springer-Verlag (1997)
16. Pavese, E.: A new data structure based on BDDs for the model check-
ing of timed systems. Degree thesis, Departamento de Computacion, Fac-
ultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires (2006)
(http://lafhis.dc.uba.ar/~epavese/tesis_pavese.ps.gz).
17. Pavese, E., Schapachnik, F.: Relaxed Clock Dierence Diagrams for Timed Au-
tomata Model Checking. Technical report, Departamento de Computacion, Facul-
tad de Ciencias Exactas y Naturales, Universidad de Buenos Aires (2007)
18. Clarke, E., Grumberg, O., Peled, D.A.: Model Checking. The MIT Press. Cam-
bridge, Massachusetts (1999)
19. Alur, R., Courcoubetis, C., Dill, D., Halbwachs, N., Wong-Toi, H.: An implemen-
tation of three algorithms for timing verication based on automata emptiness. In:
Proceedings of the 13th IEEE Real-time Systems Symposium, Phoenix, Arizona
(1992) 157{166
20. Elseaidy, W., Cleaveland, R., Jr., J.B.: Modeling and verifying active structural
control systems. Science of Computer Programming 29(1-2) (1997) 99{122
21. Braberman, V., Garbervetsky, D., Olivero, A.: ObsSlice: A timed automata slicer
based on observers. In: Proc. of the 16th Intl. Conf. CAV '04. LNCS, Springer
Verlag (2004)
22. Beyer, D., Noack, A.: Can decision diagrams overcome state space explosion in
real-time verication? In Konig, H., Heiner, M., Wolisz, A., eds.: FORTE. Volume
2767 of Lecture Notes in Computer Science., Springer (2003) 193{208
Sign up today - FREE
Mendeley saves you time finding and organizing research. Learn more
- All your research in one place
- Add and import papers easily
- Access it anywhere, anytime
Start using Mendeley in seconds!
Readership Statistics
1 Reader on Mendeley
by Discipline
by Academic Status
100% Ph.D. Student
by Country
100% Argentina


