Similarity between semantic description sets: Addressing needs beyond data integration

ISSN: 16130073

Abstract

Descriptive information is easy to understand and communicate in natural language. Examples in the biological realm include the cellular functions of proteins and the phenotypes exhibited by organisms. Large latent stores of such descriptive data are held in databases that can be mined, but even more still reside only in the scientific literature. Although such information has traditionally been opaque to computers, in recent years significant effort has gone into exposing it to computation through the development of ontologies and associated tools. A host of software applications now employ simple reasoning over Gene Ontology-annotated data to help interpret experimental findings in genomics in terms of protein function. In the domain of biological phenotypes, the combination of entity terms from taxon-specific anatomy ontologies with quality terms from generic ontologies such as PATO has been used to construct semantically precise and contextualized descriptions. It is natural for multiple semantic descriptions to pertain to a single instance in the real world, as in the case of both protein functions and organismal phenotypes. However, applications for ontology-based annotations that go beyond simple knowledge organization, and that exploit sets of semantic descriptions, are puzzlingly rare. In particular, we argue that there is wide applicability, and a sore need, for tools that can satisfy the simple, common use case of identifying statistically improbable similarity between sets of semantic descriptions. Several metrics have been proposed for this task in the literature, but they have not yet been fully evaluated, explored, or adopted. The requirements for semantic similarity tools tailored to sets of semantic descriptions include speed, scalability to large numbers of sets, demonstrated statistical and biological validity, and ease of use.

Cite

Vision, T., Blake, J., Lapp, H., Mabee, P., & Westerfield, M. (2011). Similarity between semantic description sets: Addressing needs beyond data integration. In CEUR Workshop Proceedings (Vol. 783). CEUR-WS.
