Grid metadata management: Requirements and architecture
2007 8th IEEE/ACM International Conference on Grid Computing (2007)
- ISBN: 9781424415601
- DOI: 10.1109/GRID.2007.4354121
Abstract
Metadata annotations of grid resources can potentially be used for a number of purposes, including accurate resource allocation to jobs, discovery of services, and precise retrieval of information resources. In order to realize this potential on a large scale, various aspects of metadata must be managed. These include uniform and secure access to distributed and independently maintained metadata repositories, as well as management of metadata lifecycle. In this paper we analyze these issues and present a service-oriented architecture for metadata management, called S-OGSA, that addresses them in a systematic way.
Oscar Corcho, Pinar Alper, Paolo Missier, Sean Bechhofer, Carole Goble
School of Computer Science, University of Manchester
Oxford Road, Manchester M13 9PL, United Kingdom
{ocorcho,penpecip,pmissier,seanb,carole}@cs.man.ac.uk
I. INTRODUCTION
Successful, large-scale management of Grid resources, i.e., data and services, increasingly involves the use of annotations to describe various aspects of those resources, be it the availability of computing or memory resources, or the function and interface to a Grid service. With the term “metadata” we denote, in a broad sense, any data that describes resources, and more specifically, “structured data about an object that supports functions associated with the designated object” [1]. Thus, metadata is structured according to some schema, and it is used to provide a functional or behavioral description of objects, or resources.
Annotations of resources may potentially serve a multitude of purposes, from the correct allocation of computational resources to jobs, to discovery of services, to accurate retrieval of information resources. The adoption of the Semantic Web paradigm, in particular with its associated standard languages and technologies for metadata and knowledge representation (i.e., RDF(S) [2] and OWL [3]), has been viewed as the key enabler of the annotation of Grid resources and the exploitation of this mark-up. According to this vision, dubbed the Semantic Grid (http://www.semanticgrid.org), describing various aspects of Grid resources in terms of agreed-upon formal ontologies makes it possible to generate “semantic” annotations that can be interpreted using concepts from those ontologies. This results in annotations that are predictable both in structure and in content, without being too rigid. This makes them more easily interoperable than free-format metadata with arbitrary content. Furthermore, formal techniques for automated reasoning may sometimes be leveraged to enhance the effectiveness of resource management tasks, like those listed earlier. A number of annotation tools (a survey can be found in [4]) are currently available to produce metadata, and various technologies can be used to manage those annotations, including Jena, Sesame, Boca, Oracle-RDF, Annotea, Technorati, etc.
With this recognition of the importance of metadata annotations in the Grid environment, however, also come new issues of large-scale metadata access, interoperation, and reuse. In this paper, we argue for a metadata management architecture that takes into account the specific requirements of (i) uniform access to distributed metadata produced by independent organizations; (ii) metadata lifecycle management; and (iii) uniform authorisation mechanisms, in order to support a new generation of metadata-intensive applications.
Firstly, we present some of the known metadata management issues in the context of the myGrid project (http://www.mygrid.org.uk/), which provides a rich service-based middleware infrastructure for the bioinformatics domain (Section II). Secondly, we analyze some of the existing approaches and technologies that can be used to address these issues (Section IV). Our main contribution is a novel proposal for managing metadata as first-class resources in distributed systems, known as S-OGSA (Section V). Designed as a non-disruptive extension to the OGSA architecture, S-OGSA provides a service-oriented approach to large-scale, uniform metadata management on the Grid. After presenting the core S-OGSA service, called “Semantic Binding Service” (SBS), we conclude by arguing that the proposed architecture addresses the management issues listed above. A prototype version of the SBS has been implemented, and is deployed as a Grid service within the Globus Toolkit v.4 service container.
II. A MOTIVATIONAL EXAMPLE: METADATA MANAGEMENT IN THE myGRID PROJECT
The myGrid project provides a good example for the type of metadata management requirements that we address in S-OGSA. myGrid provides bioinformaticians with a suite of middleware tools and services for assembling and executing scientific workflows that involve access to a variety of databases and data analysis tools.
In this context, metadata is pervasive: on one hand, genes and gene products are annotated, possibly by human curators, in order to provide a rich description of function or structure.
1-4244-1560-8/07/$25.00 © 2007 IEEE