Towards a Semantically Unified Environmental Information Space
Abstract
In recent years we have witnessed a proliferation of environmental information on the Web thanks to advances in automated data acquisition and to the widespread use of computer-based models and decision support systems processing environmental data. The number of environmental data providers has also been increasing. However, each provider manages its own data sets, encoded in specific data formats and unaware of related and relevant data managed by other providers. Also, most environmental data providers store their data in huge, centralized repositories, which makes access to and discovery of desired data difficult. The Linked Data principles along with the Semantic Web technologies have been recognized as a promising solution to both environmental data integration and discovery. Unique identification of environmental data by HTTP dereferenceable URIs, semantic annotation of environmental data by shared domain conceptualizations (ontologies), and interlinking of related environmental data by typed (semantic) links will enable the integration of disconnected environmental data sets into a semantically unified environmental information space. Semantic annotations and semantic links will then enable semantic discovery of environmental data over such a unified information space. In this paper, we identify a number of requirements that environmental data providers should satisfy in order to make their data fully contribute to this vision. In particular, we focus on requirements regarding environmental data identification, representation, annotation and linking.
Sasa Nesic1, Andrea Emilio Rizzoli1, and Ioannis N. Athanasiadis2
1IDSIA, Manno, Switzerland
{sasa,andrea}@idsia.ch
2Democritus University of Thrace, Xanthi, Greece
ioannis@athanasiadis.info
Keywords: environmental data identification, semantic annotation, semantic
linking.
1 Introduction
Over recent years, the adoption of the Linked Data best practices for
publishing and collecting structured data on the Web [6] has opened the
possibility of creating a unified information space, connecting data from
different sources and domains such as weather forecasts, music stores,
television and radio programs, on-line communities and business records. This
information space is commonly referred to as the Linked Open Data (LOD) Cloud
and is considered an incubator for the envisioned Web of Data. The main idea
of the Web of Data is linking data instead of linking documents, which should
enable fine-grained integration of cross-domain information into a globally
unified information space [3]. Moreover, the Web of Data has also been
recognized as a foundation for the Semantic Web, which, in spite of a number
of different interpretations, is generally understood as a global Web of
machine-readable data. Humans are the current Web's semantic component. They
are required to process the information available on the Web to ultimately
determine its meaning and relevance for the task at hand. The Semantic Web
intends to move some of that processing to software agents [7]. In order to
discover and map data more precisely, software agents require machine-readable
data and machine-understandable data semantics (metadata). What the Semantic
Web brings to the situation are a new data representation model
(predicate-based structures that express meaningful assertions) and the
ontologies and rules that enable intelligent software agents to parse meaning
from these assertions (sentences). Intelligent software agents will not be
able to `think' like their human counterparts, but they will be able to reason
logically over the explicitly encoded assertions, infer new ones, and assist
humans in completing their tasks.
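To make the idea of predicate-based assertions and rule-based inference concrete, the following is a minimal sketch in plain Python. It is an illustration under stated assumptions, not a method taken from this paper: the `http://example.org/env/` URIs, the class names, and the single inference rule (type propagation along a subclass hierarchy) are all hypothetical and chosen only for the example.

```python
# Illustrative sketch only: every URI and class name below is hypothetical.
EX = "http://example.org/env/"
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

# Each assertion is a (subject, predicate, object) triple.
triples = {
    (EX + "station42", RDF_TYPE, EX + "WeatherStation"),
    (EX + "WeatherStation", SUBCLASS_OF, EX + "Sensor"),
    (EX + "Sensor", SUBCLASS_OF, EX + "DataSource"),
}


def infer_types(facts):
    """Apply one rule until nothing new is derived:
    (x rdf:type C) and (C rdfs:subClassOf D)  =>  (x rdf:type D)
    """
    inferred = set(facts)
    while True:
        new = set()
        for (s, p, o) in inferred:
            if p != RDF_TYPE:
                continue
            for (c, q, d) in inferred:
                if q == SUBCLASS_OF and c == o:
                    candidate = (s, RDF_TYPE, d)
                    if candidate not in inferred:
                        new.add(candidate)
        if not new:
            return inferred
        inferred |= new


closure = infer_types(triples)
# The agent can now conclude that station42 is also a Sensor and a DataSource,
# although neither assertion was stated explicitly.
for triple in sorted(closure - triples):
    print(triple)
```

The explicit assertions stay untouched; the inferred ones are derived mechanically from the rule, which is the kind of logical reasoning over encoded assertions that the paragraph above attributes to software agents.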
The Linked Data and the Semantic Web principles are universal; they are not
restricted to any particular domain. As such, they represent a promising
solution for the semantic integration of currently disconnected environmental
data sets present on the Web. Traditionally, environmental data has been
published on the Web as chunks of digital content, most frequently as text
files, in some cases either stored as XML or marked up as HTML tables. Some
HTML documents containing related environmental data are interlinked, but the
meaning of the relationships between the linked documents can only be
implicitly distinguished. Hyperlinks indicate that two documents are related
in some way, but it is mostly left up to the human user to infer the nature of
the relationship. HTML initially provided elements enabling typed links
neither between documents nor between individual entities described in
particular documents. Advances in this direction, such as microformats1, have
not been widely adopted either. Environmental data are no exception to this
situation, while complexity, spatiotemporal reference, and uncertainty make
things even worse. Common practice has shown that environmental data are
usually stored in non-reusable raw formats, situated in sparse locations and
managed by different authorities, which ultimately raises obstacles to making
environmental information accessible [1]. As a result, environmental data
published on the Web looks like a set of disconnected data islands that are
unaware of each other. Having environmental data published in accordance with
the Linked Data and the Semantic Web principles would enable the building of a
semantically unified environmental information space, where environmental
information becomes a common asset that is shared among peers, instead of a
scarce resource that peers strive for [1].
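The difference between an untyped hyperlink and a typed (semantic) link can be sketched as follows. This is only an illustration: the two providers, the resource URIs, and the `ex:` predicate names are hypothetical and do not come from any real vocabulary or from this paper.

```python
# Illustrative sketch only: providers, URIs and predicates are hypothetical.
PROVIDER_A = "http://example.org/provider-a/"
PROVIDER_B = "http://example.org/provider-b/"

# An untyped hyperlink: the two resources are related "in some way".
hyperlinks = [
    (PROVIDER_A + "rainfall-2010", PROVIDER_B + "river-level-2010"),
]

# Typed links: the predicate names the nature of the relationship.
typed_links = [
    (PROVIDER_A + "rainfall-2010", "ex:sameCatchmentAs",
     PROVIDER_B + "river-level-2010"),
    (PROVIDER_A + "rainfall-2010", "ex:producedBy",
     PROVIDER_A + "station42"),
]


def follow(links, source, predicate):
    """Return the targets reachable from `source` via links of one type."""
    return [t for (s, p, t) in links if s == source and p == predicate]


# A software agent can select links by meaning, without a human reading
# the pages to work out why they are connected.
related = follow(typed_links, PROVIDER_A + "rainfall-2010",
                 "ex:sameCatchmentAs")
print(related)
```

With the untyped `hyperlinks` list, the same query cannot be expressed at all, since nothing records why the two data sets are connected; that is the gap typed links are meant to close between otherwise isolated data islands.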
1 http://microformats.org/about