Metadata Management in the Taverna Workflow System
Abstract
There seems to be general consensus on the crucial role metadata can play in enhancing the functionality of scientific workflow systems, e.g., in workflow and service discovery, composition and provenance browsing. However, in most cases their management is under-specified, if it is addressed at all. As a step towards filling this gap, the main contribution of this paper is an overview of metadata and their management in the Taverna workflow system. In Taverna, we treat metadata as first-class citizens, in the sense that we cover their full life cycle, from their creation, through their use and curation, to their eventual removal. We present the main steps of this cycle and the models used for metadata specification. In doing so, we distinguish two classes of metadata: metadata that describe workflow-related entities, such as services, workflows and subworkflows, and metadata that describe workflow executions, also known as workflow provenance.
Khalid Belhajjame⋆, Katy Wolstencroft⋆, Oscar Corcho⋆, Tom Oinn†,
Franck Tanoh⋆, Alan William⋆ and Carole Goble⋆
⋆School of Computer Science
University of Manchester, Oxford Road
Manchester, M13 9PL, UK
Khalid.Belhajjame@cs.man.ac.uk
†EMBL European Bioinformatics Institute,
Hinxton, Cambridge CB10 1SD, UK
tmo@ebi.ac.uk
1 Introduction
Key to the realisation of the semantic web vision
are metadata that describe available resources.
Metadata are generally defined as structured data
about an object that supports functions associated
with the designated object [6]. In our case,
metadata are used to describe workflow-related
entities, with the objective of enhancing the
potential of the applications that make use of
them, either internally, that is, within the
workflow system, or externally, i.e., by
third-party applications. For example, using
metadata that describe a workflow, its constituent
processors (i.e., the steps that compose the
workflow), the services invoked when processors
are enacted, and processors’ dependencies, users
may be able to understand the scientific value of
the experiment implemented by the workflow and the
task performed by each step, and to debug
mismatches by analysing processors’ dependencies.
Commonly, metadata are specified using
annotations, which associate resources with their
respective descriptions. In their simplest form,
annotations can be textual descriptions or lists
of keywords. However, to enable their use by
machines as well as humans, a more controlled
annotation mechanism should be employed for their
specification. For example, annotations can be
encoded as associations that relate the annotated
resources to concepts and properties defined in
ontologies. An ontology is described as an explicit
specification of a shared conceptualisation [7].
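To illustrate the idea of annotations encoded as associations between resources and ontology terms, the following is a minimal sketch and not Taverna's actual annotation model. The class name Annotation, the function find_resources, the resource identifiers and the "ex:" ontology terms are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Annotation:
    """One association: resource --property--> concept (all names hypothetical)."""
    resource: str  # identifier of the annotated resource, e.g. a service or workflow
    prop: str      # property assumed to be defined in some ontology
    concept: str   # concept assumed to be defined in the same ontology


def find_resources(annotations: Iterable[Annotation], prop: str, concept: str) -> List[str]:
    """Return the sorted identifiers of resources related to `concept` via `prop`."""
    return sorted({a.resource for a in annotations
                   if a.prop == prop and a.concept == concept})


# Hypothetical annotations; the "ex:" terms do not come from a real ontology.
EXAMPLE_ANNOTATIONS = [
    Annotation("service:blast", "ex:performsTask", "ex:SequenceAlignment"),
    Annotation("service:clustalw", "ex:performsTask", "ex:SequenceAlignment"),
    Annotation("workflow:wf1", "ex:performsTask", "ex:GeneAnnotation"),
]

if __name__ == "__main__":
    print(find_resources(EXAMPLE_ANNOTATIONS, "ex:performsTask", "ex:SequenceAlignment"))
```

Because each annotation refers to controlled terms rather than free text, a machine can answer a query such as "which resources perform sequence alignment?" by exact matching, which a keyword list or textual description would not reliably support.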