Temporal Dependency based Checkpoint Selection for Dynamic Verification of Temporal Constraints in Scientific Workflow Systems

by Jinjun Chen, Yun Yang
ACM Transactions on Software Engineering and Methodology (2011)

Abstract

In a scientific workflow system, a checkpoint selection strategy selects checkpoints along scientific workflow execution at which temporal constraints are verified, so that temporal violations can be identified and handled in time. This ensures the overall temporal correctness of the execution, which is often essential for the usefulness of execution results. The problem with existing representative strategies is that they do not differentiate between temporal constraints: once a checkpoint is selected, they verify all temporal constraints. However, such a checkpoint does not need to be taken for those constraints whose consistency can be deduced from others. Verifying such constraints is consequently unnecessary and can severely reduce overall temporal verification efficiency, and that efficiency determines whether temporal violations can be identified quickly enough to be handled in time. To address the problem, in this article we develop a new temporal-dependency-based checkpoint selection strategy which selects checkpoints in accordance with different temporal constraints. With our strategy, the corresponding unnecessary verification can be avoided. The comparison and experimental simulation further demonstrate that our new strategy significantly improves the efficiency of overall temporal verification over the existing representative strategies.

Temporal Dependency based Checkpoint
Selection for Dynamic Verification of Temporal
Constraints in Scientific Workflow Systems*
JINJUN CHEN, YUN YANG
Swinburne University of Technology
________________________________________________________________________

Categories and Subject Descriptors: D.2.4 [Software Engineering]: Software/Program Verification
General Terms: Algorithms, Design, Reliability, Theory, Verification
Additional Key Words and Phrases: Scientific workflows, temporal constraints, temporal verification,
checkpoint selection
________________________________________________________________________


1. INTRODUCTION
From the perspective of software engineering, a scientific workflow system is a type of
scientific software in the area of Software Engineering for Computational Science and
Engineering, which is attracting increasing attention from software engineering researchers
[Ludäscher et al. 2006, Seces 2008]. It is responsible for modelling and executing large-scale,
sophisticated scientific workflows found in a variety of complex computation and
data intensive applications such as astrophysics, climate modelling and earthquake
simulation [Abramson et al. 2005, Daisuke et al. 2007, Gil et al. 2007, Mandal et al. 2007,
Taylor et al. 2007]. A scientific workflow normally contains a large number of computation
and data intensive activities [Deelman and Chervenak 2008, Maechling et al. 2005, Oinn
et al. 2006, Prodan and Fahringer 2008]. One of the software engineering research issues in
developing a scientific workflow system is temporal verification, which aims to identify any
temporal violations in scientific workflow specifications and executions [Chen and Yang
2008a, Yu and Buyya 2005].
________________________________________________________________________
This research is partly supported by Australian Research Council Discovery Project under grant No.
DP0663841, and Australian Research Council Linkage Project under grant No. LP0990393.
Authors' addresses: J. Chen and Y. Yang, Faculty of Information and Communication Technologies, Swinburne
University of Technology, PO Box 218, Hawthorn, Melbourne, Australia 3122; email: {jchen;
yyang}@swin.edu.au.

Permission to make digital/hard copy of part of this work for personal or classroom use is granted without fee
provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice,
the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM,
Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific
permission and/or a fee.
© 2008 ACM 1073-0516/01/0300-0034 $5.00
* A preliminary version of this paper was published in the 30th International Conference on Software Engineering
(ICSE2008), Leipzig, Germany, 10-18 May 2008. See Ref [Chen and Yang 2008b]. The work reported in this
paper is a significant extension and generalisation for all types of temporal constraints.

1.1 Temporal Constraints
In reality, a scientific workflow is normally time constrained [Brandic et al. 2008, Pandey
and Buyya 2008] because temporal correctness, i.e. whether the scientific workflow can be
completed on time, is essential to ensure the usefulness of its execution results. For example,
an astrophysics scientific workflow for detecting the existence of a gravitational wave in a
signal channel is a time-critical real-time streaming data application [Daisuke et al. 2007].
Even a small delay in its completion can cause a detection to be missed because of its
real-time and streaming nature. Several more years may then pass before another
wave appears. Worse, we would not even know that a wave had been missed, since the
omission caused by the time delay goes unnoticed. As a result, all completed and continuing
costly computation, including expensive use of supercomputing facilities for thousands of
hours, becomes useless and hence a big economic loss (see footnote 1). Taking the supercomputer
hosted and funded by Swinburne University of Technology in Australia [SwinSuper 2009]
as an example, the charge for 24 hours of use is about US$20,000. For thousands of hours
of use, the economic loss is correspondingly larger. As such,
temporal constraints should be set in scientific workflow specifications to enable the control
and monitoring of temporal correctness during execution. The main types of temporal
constraints are upper bound, lower bound and fixed-time [Chen and Yang 2008b, Eder et
al. 1999]. An upper bound constraint between two activities is a relative time value such
that the duration between them must be less than or equal to it. A lower bound constraint
between two activities is a relative time value such that the duration between them must be
greater than or equal to it. A fixed-time constraint at an activity is an absolute time value,
such as 6:00pm, by which the activity must be completed.
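The three constraint types defined above can be illustrated with a minimal Python sketch. The class names, the `holds` function and the timeline dictionaries are illustrative assumptions rather than notation from this paper. The sketch also assumes that the duration between two activities is measured from the start of the first activity to the completion of the second.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Union

# Hypothetical execution record: when each activity started and completed.
Timeline = Dict[str, datetime]


@dataclass(frozen=True)
class UpperBound:
    start: str          # start activity of the constraint
    end: str            # end activity of the constraint
    value: timedelta    # relative time value


@dataclass(frozen=True)
class LowerBound:
    start: str
    end: str
    value: timedelta


@dataclass(frozen=True)
class FixedTime:
    activity: str
    deadline: datetime  # absolute time value, e.g. 6:00pm on a given day


Constraint = Union[UpperBound, LowerBound, FixedTime]


def duration(started: Timeline, completed: Timeline, start: str, end: str) -> timedelta:
    # Assumption: duration runs from the start of `start` to the completion of `end`.
    return completed[end] - started[start]


def holds(c: Constraint, started: Timeline, completed: Timeline) -> bool:
    """Check one constraint against a finished execution."""
    if isinstance(c, UpperBound):
        return duration(started, completed, c.start, c.end) <= c.value
    if isinstance(c, LowerBound):
        return duration(started, completed, c.start, c.end) >= c.value
    if isinstance(c, FixedTime):
        return completed[c.activity] <= c.deadline
    raise TypeError(f"unknown constraint type: {type(c).__name__}")


if __name__ == "__main__":
    t0 = datetime(2009, 1, 1, 9, 0)
    started = {"a1": t0, "a2": t0 + timedelta(hours=2)}
    completed = {"a1": t0 + timedelta(hours=2), "a2": t0 + timedelta(hours=5)}
    # The duration from a1 to a2 is 5 hours in this made-up execution.
    print(holds(UpperBound("a1", "a2", timedelta(hours=6)), started, completed))
    print(holds(LowerBound("a1", "a2", timedelta(hours=6)), started, completed))
    print(holds(FixedTime("a2", datetime(2009, 1, 1, 18, 0)), started, completed))
```

This sketch checks constraints only after execution has finished. Verifying them at checkpoints during execution, which is the subject of this paper, needs more than these simple comparisons.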
Comparing the three types of temporal constraints, we can see that conceptually a
lower bound constraint is symmetrical to an upper bound constraint while a fixed-time
constraint is a special case of an upper bound constraint. The reasons are as follows. For a
lower bound constraint, we check whether the duration between its start and end
activities is greater than or equal to (≥) its value, while for an upper bound constraint, we
check whether the duration between its start and end activities is less than or equal
to (≤) its value. Therefore, they are symmetrical to each other. As for a fixed-time
constraint, its start activity is in effect the first activity of the scientific workflow. Hence, a
fixed-time constraint can be viewed as a special upper bound constraint whose start
activity is the first activity and whose end activity is the activity at which the fixed-time
constraint is set. Nevertheless, an upper bound constraint is conceptually more general than a
fixed-time constraint as its start activity can be an intermediate activity rather than the
first activity. Besides, different upper bound constraints can have different start activities
while all fixed-time constraints have the same start activity, which is the first activity.
As such, in this paper, we focus on upper bound constraints only. The corresponding
discussion and results can be symmetrically applied to lower bound constraints and
adaptively simplified for fixed-time constraints.
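The reduction described above can be sketched in a few lines of Python. The names `UpperBound`, `fixed_time_as_upper_bound` and `lower_bound_holds` are hypothetical, and the sketch assumes that the time at which the first activity starts is known, so that the absolute deadline can be turned into a relative value.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class UpperBound(NamedTuple):
    start: str          # start activity
    end: str            # end activity
    value: timedelta    # duration between them must be <= value


def fixed_time_as_upper_bound(first_activity: str, workflow_start: datetime,
                              activity: str, deadline: datetime) -> UpperBound:
    """View a fixed-time constraint as an upper bound constraint that
    starts at the first activity of the workflow (assumed start time known)."""
    return UpperBound(first_activity, activity, deadline - workflow_start)


def upper_bound_holds(duration: timedelta, value: timedelta) -> bool:
    return duration <= value


def lower_bound_holds(duration: timedelta, value: timedelta) -> bool:
    # Symmetry: d >= v is the same test as (-d) <= (-v).
    return upper_bound_holds(-duration, -value)


if __name__ == "__main__":
    start = datetime(2009, 1, 1, 9, 0)
    deadline = datetime(2009, 1, 1, 18, 0)      # "6:00pm"
    ub = fixed_time_as_upper_bound("a1", start, "a5", deadline)
    finished = datetime(2009, 1, 1, 17, 30)     # made-up completion time of a5
    # Both views give the same answer for the same execution.
    print(finished <= deadline, upper_bound_holds(finished - start, ub.value))
```

Every converted constraint has the first activity as its start activity, which is why the text treats fixed-time constraints as the less general case.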




1 In fact, there is also a loss of scientific discovery of the gravitational wave. However, as it is not relevant to the
scope of this paper, we do not discuss it further.
