Effective NL paraphrasing of ontologies on the semantic web (Technical Report
Workshop on EndUser Semantic Web Interaction (2005)
Effective NL Paraphrasing of Ontologies on the
Semantic Web
Daniel Hewlett (1), Aditya Kalyanpur (2), Vladimir Kolovski (2),
Christian Halaschek-Wiener (2)
(1) hewlett@umd.edu (2) {aditya, kolovski, halasche}@cs.umd.edu
Dept. of Computer Science,
University of Maryland,
College Park MD 20742
Abstract. In this paper, we present an algorithm that provides natural
language (NL) paraphrases for OWL Ontologies on the Semantic Web.
Our goal is to ensure both fluency (readability) and accuracy of the out-
put, in terms of preserving the meaning conveyed by its description logic
formalism. The approach is generic, domain-independent, and completely
automated. We describe the algorithm in detail and follow it with a
subjective evaluation (pilot study) of our approach on real-world
ontologies, comparing it with current tools that provide similar
functionality.
1 Motivation and Goals
With the advent of OWL, and its subset OWL-DL, semantic web content
is backed by a precisely-defined Description Logic (DL). This property
means that the meaning of semantic web content will always be clear and
potentially useful to an intelligent agent, or reasoner-equipped software
application. However, concept definitions (OWL Classes) are specified
in the language of logic, requiring humans to understand this logical
language in order to decipher the meaning of concepts. For end users of
semantic-web-enabled applications, this may pose a usability problem
in many important circumstances, effectively creating a barrier to entry
into the semantic web.
To remove this barrier, we have designed and implemented a procedure
for generating near-Natural Language (NL) paraphrases (in English) of
OWL concept definitions that preserve the semantics of the DL descrip-
tion. These paraphrases can be presented to the user either in addition to
or instead of the logical class definitions. By presenting the class names
and English definitions, designers can keep users in an environment of
entirely natural language-based interaction, while not losing the semantic
rigor and precision that OWL provides.
For a procedure such as ours to be widely useful, it has to be not only
robust but also domain-independent, able to work with a large number
of the concepts and ontologies available. A domain-independent solution
is desirable because it can immediately make use of the numerous OWL
ontologies that already exist, modeling everything from clinical and
environmental information (e.g., NCI and JPL) to personal interests and
relationships (e.g., FOAF). A domain-specific procedure, however, will
need to be re-tuned to each domain or ontology, greatly increasing the
amount of work on the part of ontology designers. Also, distributing
and integrating the extra domain-specific information required by such
programs would add a layer of data not included in standard OWL.
Our approach fulfills all of these criteria. First, because we use only
the names of properties and classes, which are already present in the
OWL ontology, we do not require extensions to the ontology or additional
information sources. This makes our approach valid for any ontology
where the classes and properties are named appropriately. Also, the most
sophisticated NL processing tool our approach utilizes is a part-of-speech
(POS) tagger, which is a fast and simple application. Slightly better
results could possibly be generated using a richer set of NL abilities,
such as conjugation of verb forms, grammaticality judgments, parsing,
etc., but such an application would be much less efficient.
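To make the idea of working only from class and property names concrete,
the following is a minimal sketch, not the algorithm described in this
paper. It assumes names are written in camelCase or with underscores, and
it substitutes a crude check on the first word of a property name for a
real POS tagger. The function names (split_name, noun_phrase,
paraphrase_some), the sentence templates, and the example identifiers are
all hypothetical and chosen for illustration only.

```python
import re


def split_name(name: str) -> list[str]:
    """Split an OWL local name such as 'hasTopping' into lowercase words."""
    words = []
    # Underscores, hyphens and whitespace are treated as word separators.
    for part in re.split(r"[_\-\s]+", name):
        # Acronym runs, capitalized or lowercase words, and digit runs.
        words.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return [w.lower() for w in words]


def article(word: str) -> str:
    """Pick 'a' or 'an' from the first letter (a rough approximation)."""
    return "an" if word[0] in "aeiou" else "a"


def noun_phrase(class_name: str) -> str:
    """Render a class name as an indefinite noun phrase."""
    words = split_name(class_name)
    if not words:
        raise ValueError(f"no words found in name: {class_name!r}")
    return f"{article(words[0])} {' '.join(words)}"


def paraphrase_some(class_name: str, prop_name: str, filler_name: str) -> str:
    """Paraphrase 'class_name has some prop_name value in filler_name'.

    Assumption: a real system would consult a POS tagger here; this sketch
    only checks whether the property name starts with the word 'has'.
    """
    prop = split_name(prop_name)
    subject = noun_phrase(class_name).capitalize()
    filler = noun_phrase(filler_name)
    if len(prop) > 1 and prop[0] == "has":
        role = " ".join(prop[1:])
        return f"{subject} has {article(role)} {role} that is {filler}."
    return f"{subject} {' '.join(prop)} {filler}."


if __name__ == "__main__":
    print(paraphrase_some("CheesyPizza", "hasTopping", "CheeseTopping"))
    print(paraphrase_some("Person", "worksFor", "Organization"))
```

Because the sketch relies entirely on the names being meaningful, it
breaks down for opaque identifiers, which is the same limitation the text
notes for ontologies whose classes and properties are not named
appropriately.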
2 Related Work: Current State of the Art
As discussed earlier, our aim is to devise an algorithm for generating NL
explanations of a conceptual term defined in OWL. We intend to build
upon and extend the results of previous efforts in this area, which are
briefly discussed here.
An excellent example of the instructional use of NL paraphrases for un-
derstanding OWL Concepts is described in [5]. We take inspiration from
this work and attempt to automatically generate NL paraphrases such
as the ones illustrated in their paper. At the time of writing, the
authors know of no implementation that has achieved this. The Class
Description Display plugin (http://www.co-ode.org/downloads/cdc/)
mentioned in their paper works with the Protege OWL plugin and provides
simple quasi-NL descriptions that resemble OWL Abstract Syntax
(http://www.w3.org/TR/owl-semantics/). We note that, in general, the
OWL Abstract Syntax (AS), while a step above RDF/XML in terms of
readability, is still very complex for novice end users (for a small
example of this, see Figure 1).
In [2], a technique for mapping elementary semantic expressions to corre-
sponding NL representations is presented. In their approach, the authors
apply multiple sequence alignment techniques to a semantic expression
along with corresponding alternative verbalizations. This then produces
a more expressive and accurate single dictionary entry. Our approach
differs in that we do not assume that verbalizations are given; we
generate them algorithmically from the semantic expression itself.
In [3], a subset of English called Attempto Controlled English (ACE) is
introduced. ACE is translated unambiguously into first-order logic and
thus can be used as a formal notation. Even though ACE appears to be an
NL, it is actually a formal language with the semantics of First Order Logic
(FOL). In comparison, our tool converts OWL classes, which are based


