
Chemical Knowledge for the Semantic Web

by Mykola Konyk, Alexander De Leon, Michel Dumontier
World Wide Web Internet And Web Information Systems (2008)

Abstract

With over 80 file formats to represent various chemical attributes, the conversion between one format and another is invariably lossy due to informal specifications. In contrast, the use of a formal knowledge representation language such as the Web Ontology Language (OWL) enables precise molecular descriptions that can be reasoned about in a logically valid manner. In this paper, we describe a chemical knowledge representation using OWL. We demonstrate its utility in querying a new drug repository created from PubChem, DrugBank and DBpedia. By leveraging Semantic Web technologies, it becomes possible to integrate chemical information at differing levels of detail and granularity, opening new avenues for life science knowledge discovery.


Available from www.springerlink.com

A. Bairoch, S. Cohen-Boulakia, and C. Froidevaux (Eds.): DILS 2008, LNBI 5109, pp. 169–176, 2008.
© Springer-Verlag Berlin Heidelberg 2008
Mykola Konyk¹, Alexander De Leon¹, and Michel Dumontier¹,²,³
¹ School of Computer Science
² Department of Biology
³ Institute of Biochemistry,
Carleton University, 1125 Colonel By Drive,
K1S 5B6, Ottawa, Canada
mkonyk@gmail.com, alexjdl@gmail.com,
michel_dumontier@carleton.ca
Keywords: semantic web, knowledge representation, knowledge engineering,
ontology, life sciences, question answering, OWL, chemistry, molecule, mashup.
1 Introduction
While powerful web search engines can sift through enormous amounts of
biochemical information online, it is still difficult to find compounds having a set of
desirable attributes, e.g. compounds that can form specific derivatives, or that are
stable at room temperature and have a non-toxic metabolic profile. Although over 80
file formats exist to represent chemical data, none, including the Chemical Markup
Language (CML) [1], is capable of encoding arbitrary knowledge in such a way that the
meaning is wholly preserved. Controlled vocabularies have been designed for
chemical functional groups (CO [2]) or compounds (ChEBI [3]), but they are
generally used for the annotation of chemicals or in navigation of search results. In
contrast, Semantic Web ontologies aim to explicitly describe and relate objects using
formal, logic-based representations that a machine can understand and process [4].
This will facilitate knowledge representation, integration and question answering in
areas of critical importance to the life sciences.
In this paper, we describe a knowledge representation for chemical information
using OWL, the Web Ontology Language [5]. OWL facilitates the description of
complex concepts from simpler ones and can be used for consistency checking and
classification [6]. We describe our efforts to integrate DrugBank and PubChem, two
popular chemical databases, and DBpedia, an RDF version of Wikipedia. Finally, we
illustrate the value of using semantic web technologies to seamlessly integrate and
query diverse biochemical knowledge in a manner that opens new avenues for
knowledge discovery in the life sciences.
2 Methods
2.1 Chemical Knowledge Representation
Upper level ontologies increase interoperability and semantic coherency of domain
ontologies by grounding the basic types of domain entities and imposing restrictions
on the relationships that these entities may hold. We use the Basic Formal Ontology
(BFO) [7] because it offers a simple framework that distinguishes objects, qualities,
processes and spatial regions. Our Basic Relation Ontology¹ (BRO) provides
object-process, object-quality, parthood, spatial and temporal relations drawn from
foundational work [8]. The New Upper Level Ontology² (NULO) maps the domain and range
values of BRO properties to BFO concepts, and further constraints on relations are
specified in NULO-constraints³. Reflexive, irreflexive, asymmetric and disjoint roles,
as well as role chains, have been added to the BRO-OWL11 ontology⁴ so as to maximize
reasoning capability [9].
An outline of the chemical knowledge representation is illustrated in Fig. 1. Briefly,
molecules, atoms and rings are types of objects that bear qualities and may be located
in spatial regions.
Objects: Molecules, atoms, rings are types of objects that are spatially extended,
maximally self-connected and self-contained and bear any number of qualities
appropriate to their type.
Qualities: A quality is a categorical property that exists in some object. Qualities
have been defined for each kind of object. For instance, a molecule might bear the
quality of monoisotopic mass whereas the partial charge is an atom quality. Some
quality types may be borne by multiple types of objects (e.g. both atoms and molecules
may bear a chiral quality). We have identified over 50 types of qualities, largely defined
from OpenBabel and PubChem descriptors.
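The pairing of quality types with the object types that may bear them can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary ALLOWED_BEARERS, the function can_bear and the lowercase quality names are hypothetical and are not identifiers from the ontology; only the three pairings themselves come from the text above.

```python
# Hypothetical lookup table: quality type -> object types that may bear it.
# Only the three pairings mentioned in the text are encoded here.
ALLOWED_BEARERS = {
    "monoisotopic_mass": {"Molecule"},
    "partial_charge": {"Atom"},
    "chirality": {"Atom", "Molecule"},
}


def can_bear(object_type, quality_type):
    """Return True if objects of object_type may bear quality_type."""
    # Unknown quality types are treated as not bearable by anything.
    return object_type in ALLOWED_BEARERS.get(quality_type, set())


if __name__ == "__main__":
    for obj, quality in [("Molecule", "monoisotopic_mass"),
                         ("Atom", "monoisotopic_mass"),
                         ("Atom", "chirality")]:
        print(obj, quality, can_bear(obj, quality))
```

In an OWL ontology the same restriction would more naturally be stated as domain constraints on the object-quality relation, so that a reasoner can flag violations. The lookup table only mimics that check.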
Mereology: A molecule is composed of at least two atoms and has zero or
more ring parts. Molecules and rings are related to atoms by hasProperPart, an
asymmetric relation. Molecules and rings are related to each other by hasPart, a
transitive (if a hasPart b and b hasPart c, then a hasPart c) and reflexive (an object
has itself as a part) relation. Thus, a ring may also be a molecule (e.g. benzene).
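The logical characteristics of the two parthood relations can be illustrated with a small Python sketch. It is not the reasoning procedure used in this work: an OWL reasoner derives these entailments from the property axioms, whereas the code below computes them by brute force. The individual names (naphthalene, fused_ring_system, six_membered_ring, carbon_atom_1) and both function names are hypothetical.

```python
from itertools import product


def reflexive_transitive_closure(pairs):
    """Return the reflexive, transitive closure of a binary relation."""
    nodes = {x for pair in pairs for x in pair}
    # Reflexivity: every individual has itself as a part.
    closure = set(pairs) | {(n, n) for n in nodes}
    changed = True
    while changed:
        changed = False
        # Transitivity: a hasPart b and b hasPart c implies a hasPart c.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def is_asymmetric(pairs):
    """True if no pair (a, b) has its converse (b, a) in the relation."""
    return all((b, a) not in pairs for (a, b) in pairs)


# Hypothetical asserted facts, for illustration only.
has_part = {
    ("naphthalene", "fused_ring_system"),
    ("fused_ring_system", "six_membered_ring"),
}
has_proper_part = {("six_membered_ring", "carbon_atom_1")}

inferred = reflexive_transitive_closure(has_part)

if __name__ == "__main__":
    print(("naphthalene", "six_membered_ring") in inferred)
    print(("naphthalene", "naphthalene") in inferred)
    print(is_asymmetric(has_proper_part))
```

Because an asymmetric relation can never contain a pair (a, a), hasProperPart is also irreflexive, which is what separates it from the reflexive hasPart.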

¹ http://ontology.dumontierlab.com/bro
² http://ontology.dumontierlab.com/nulo
³ http://ontology.dumontierlab.com/nulo-constraints
⁴ http://ontology.dumontierlab.com/bro-owl11
