Using Mutation for the Assessment and Optimization of Tests and Properties
Abstract
We are interested in exploring the complementary relationship and tradeoffs between testing and property-based analysis with respect to bug detection. In this paper we present an empirical approach to the assessment of testing and property based analysis tools using metrics to measure the quantity and efficiency of each technique at finding bugs. We have implemented our approach in an assessment component that has been constructed to allow for symmetrical comparison and evaluation of tests versus properties. In addition to assessing test cases and properties we are also interested in using each to optimize the other as well as to develop hybrid quality assurance approaches. We hypothesize that the synergies of using testing and property-based analysis in combination will allow for optimizations in test suites and property sets that are not possible by using both approaches in isolation.
Using Mutation for the Assessment and Optimization of Tests and Properties
Tests and Properties∗
Jeremy S. Bradbury
School of Computing, Queen’s University
Kingston, Ontario, Canada
bradbury@cs.queensu.ca
ABSTRACT
We are interested in exploring the complementary relation-
ship and tradeoffs between testing and property-based anal-
ysis with respect to bug detection. In this paper we present
an empirical approach to the assessment of testing and property-
based analysis tools using metrics to measure the quantity
and efficiency of each technique at finding bugs. We have
implemented our approach in an assessment component that
has been constructed to allow for symmetrical comparison
and evaluation of tests versus properties. In addition to
assessing test cases and properties we are also interested
in using each to optimize the other as well as to develop
hybrid quality assurance approaches. We hypothesize that
the synergies of using testing and property-based analysis in
combination will allow for optimizations in test suites and
property sets that are not possible by using both approaches
in isolation.
Keywords
testing, formal analysis, test suite assessment, property as-
sessment, empirical software engineering.
1. INTRODUCTION
The goal of the proposed research work is to increase the
quality assurance of software systems by exploiting the syn-
ergies that exist between testing and property-based analy-
sis. Our interest in exploring the complementary relation-
ship between testing and property-based analysis is moti-
vated on the one hand by advances in the theory and prac-
tise of property-based analysis, especially formal analysis,
and on the other hand, by a need for improved quality assur-
ance techniques for industrial code – especially concurrent
code. We believe that the property-based formal analysis
tools that are now available offer the potential to substan-
tially aid in the debugging of industrial concurrent code.
∗This work was supported by the Natural Sciences and En-
gineering Research Council of Canada (NSERC)
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, to
republish, to post on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
Copyright 200X ACM X-XXXXX-XX-X/XX/XX ...$5.00.
Intuitively, the detection of a property or assertion viola-
tion, such as a violation of a method pre-condition, a loop
invariant, a class representation invariant, an interface us-
age rule, or a temporal property should be more insightful
than the failure of a possibly global test case.
A shift in the focus of formal methods from proofs of cor-
rectness to debugging and testing has been advocated by
a number of researchers including Rushby [20]. In recent
years, tool development in the formal analysis community
has matured and the current generation of tools are auto-
matic, scalable, and only leave a small semantic gap between
the source artifacts used by developers and the model arti-
facts required for analysis. The ability to directly analyze
source code and the increase in size of systems that can be
analyzed has helped formal analysis become a viable option
for software debugging.
While the majority of software systems currently devel-
oped in industry are single-threaded sequential programs,
there is mounting evidence that “applications will increas-
ingly need to be concurrent if they want to fully exploit CPU
throughput gains that have now started becoming avail-
able and will continue to materialize over the next several
years” [21]. The shift from sequential to concurrent systems
provides an opportunity for the application of formal analy-
sis techniques which can often succeed at debugging concur-
rent systems while testing in this setting is often insufficient
or impractical.
2. HYPOTHESIS
Using a mutation-based approach to testing and
property-based analysis will allow for the assess-
ment and comparison of test suites and property
sets. Furthermore, the synergies of using testing
and formal analysis in combination will allow for
optimizations in test suites and property sets that
are not possible by using both approaches in isola-
tion. These optimizations can be integrated effi-
ciently into a modern software quality assurance
process and can help to increase the effectiveness
and efficiency of software quality assurance pro-
cesses.
The term assessment in our hypothesis refers to the statis-
tical evaluation of a test suite or property set using mutation
testing metrics. Mutation testing uses mutation operators
to generate faulty versions of the original program called mu-
tants. If we assume the original program as being correct
then a mutant version that is non-equivalent can be thought
tants detected (killed) by a test suite or property set is the
mutant score. We have chosen to use a mutation metric
because a recent study found that for the programs being
studied, mutant faults were a good measure of real faults [2].
The term optimization refers to two activities: generation
and reduction. On the one hand, test cases and properties
are generated to allow a given test suite or property set
to achieve a better evaluation using the mutation testing
metric. On the other hand, test suites and property sets are
reduced when the removal of test cases and property sets
will not reduce the evaluation result using the metric.
3. RESEARCH APPROACH
Our research approach consists of two phases. The first
phase is the assessment phase where we empirically evalu-
ate and explore the synergies between testing and property-
based analysis. The second phase is the optimization phase
where we take the results from out empirical assessment and
try to use it to enhance the quality assurance of a given pro-
gram by improving the test suite and property set as well
as develop hybrid techniques. Figure 1 shows our proposed
framework to support our research approach. The current
state of the framework is that the assessment component has
been completed but the optimization components have not.
We will now outline the assessment phase and the optimiza-
tion phase in more detail. Due to space we will not discuss
our experimental setup which was presented in a previous
paper [4].
3.1 Assessment Phase
To support the experimental procedure we have devel-
oped the assessment component of our framework such that
it features a high-degree of automation and customizabil-
ity and thus allows for a large number of experiments to be
carried out as efficiently as possible. Our assessment com-
ponent essentially consists of a Java application that acts as
a wrapper to all of the other tools and scripts used. The
framework is generic enough to allow for the comparison of
multiple testing and property-based analysis approaches.
For simplicity we will only explain the assessment frame-
P
r
o
p
e
r
t
y
-
B
a
s
e
d
A
n
a
l
y
s
i
s
T
e
s
t
i
n
g
T
X
L
M
u
t
a
n
t
G
e
n
e
r
a
t
o
r
s
P
r
o
p
e
r
t
i
e
s
P
r
o
p
e
r
t
i
e
s
T
e
s
t
C
a
s
e
s
T
e
s
t
S
u
i
t
e
O
p
t
i
m
i
z
a
t
i
o
n
C
o
m
p
o
n
e
n
t
P
r
o
p
e
r
t
y
S
e
t
O
p
t
i
m
i
z
a
t
i
o
n
C
o
m
p
o
n
e
n
t
P
r
o
p
e
r
t
i
e
s
P
r
o
p
e
r
t
i
e
s
P
r
o
p
e
r
t
i
e
s
A
s
s
e
s
s
m
e
n
t
C
o
m
p
o
n
e
n
t
C
o
l
l
e
c
t
i
o
n
a
n
d
D
i
s
p
l
a
y
o
f
R
e
s
u
l
t
s
O
r
i
g
i
n
a
l
S
o
u
r
c
e
C
o
d
e
R
O
R
S
D
L
A
B
S
.
.
.
M
u
t
a
n
t
S
o
u
r
c
e
C
o
d
e
A
s
s
e
s
s
m
e
n
t
R
e
s
u
l
t
s
D
a
t
a
b
a
s
e
Figure 1: Proposed Assessment and Optimization
Framework
Assessment Results Reported by the Framework
T
e
s
t
s
x Mutant score for each test case/test suite
x Execution cost for each test case/test suite
x Number of test cases that kill each mutant
P
r
o
p
e
r
t
i
e
s
x Mutant score for each property/property set
x Execution cost for each property/property set
x Number of properties that kill each mutant
x Mutant score for each property pattern type
x Types of mutants killed by each property pattern type
I
n
t
e
g
r
a
t
i
n
g
t
e
s
t
s
&
p
r
o
p
e
r
t
i
e
s
x Hybrid set of tests and properties that achieve the
highest mutant score
x Hybrid set of tests and properties that achieve a certain
mutant score (e.g. 90%) and
ohas the lowest execution cost
ohas the smallest set of tests and properties
Table 1: Types of results collected and reported
work in the context of comparing one testing technique (e.g.,
concurrent testing using a randomized scheduler with Con-
Test [8]) with one property-based analysis technique (e.g.,
static analysis approach using Path Inspector [1]) or for-
mal analysis approach using Bandera [6]/Bogor [19]). When
comparing testing with a property-based analysis the com-
ponent requires as input a program (e.g., written in C, C++,
Java) and an accompanying test suite and property set.
Once the inputs have been selected, the assessment com-
ponent implements four main steps in the experimental pro-
cedure:
1. Mutant generation: A built-in set of mutation opera-
tors can be selected individually to allow for the gen-
eration of mutant programs to be customized.
2. Property-based Analysis: Our application calls an au-
tomatically generated script which allows all of the
property-based analyses to be performed automatically.
For our set of properties we first evaluate each prop-
erty using the original program to determine the ex-
pected outputs. Next we evaluate our property set
for all of the mutant versions of the original program.
During property-based analysis all of the verification
results, generated counter-examples, and analysis ex-
ecution times of each property with each program are
recorded.
3. Testing: Our application calls an automatically gen-
erated script which compiles the source code and exe-
cutes the testing, recording the output result and exe-
cution time for each test case with each program.
4. Collection and display of results: We compare the ex-
ecution and analysis results of the original program
with the results of executing and analyzing the mu-
tant programs to see if each test case and property
was able to distinguish the mutant programs from the
original (see Table 1).
Currently, we have implemented all four steps of the ex-
perimental procedure in the assessment component and sam-
ple screenshots of the component are given in Figure 2. We
have made the component customizable and flexible to sup-
port a wide range of experiments using different programs,
languages, tools, and properties. For example, we plan to
compare different property-based analysis techniques and
compare different types of properties (e.g. assertions vs.
LTL properties). We will discuss the specific experiments
we are currently planning in Section 4.
Sign up today - FREE
Mendeley saves you time finding and organizing research. Learn more
- All your research in one place
- Add and import papers easily
- Access it anywhere, anytime


