A brave new (virtual) world: distributed searches, relevance scoring and facets

  • King T
  • Narock T
  • Walker R
  • et al.

This article is free to access.

Abstract

Our ability to deal with complex systems has improved through information systems research, which has brought better modeling (of both data and systems), the use of semantics, and advances in distributed computing. The past decade has seen an explosion in the amount and variety of geosciences data and the emergence of true open data repositories through which scientists can freely access these data. Those data are found in thousands of repositories located around the world. Virtual observatories have been created to help scientists search those repositories to find and access the data they require. This challenge is being addressed with technologies such as the Internet (with ample connectivity and bandwidth), the Web, cheap computing power, cheap storage, and standards for critical components. Many scientific disciplines are developing virtual observatories, yet some of the most compelling science questions cross multiple domains. While semantics can provide cross-domain reasoning, the first step in answering a question is often to determine which available resources may be relevant to a topic. The topic can be expressed as simple phrases or word sequences. Using a common relevance scoring method at all locations enables a federated search across loosely coupled providers, and the results can be organized into facets to help the user select the most promising resources with which to pursue the scientific investigation. We describe an approach to developing and deploying relevance scoring methods and faceted results in this brave new (virtual) world. We have found that a scoring method that considers both the presence of the query terms and their proximity, relative to the order of the terms in the query, improves the assessment of relevance. We call this Term Presence-Proximity (TPP) scoring and describe a method for calculating a normalized score. TPP scoring compares favorably with other scoring approaches.
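
The abstract does not give the TPP formula, so the following Python sketch only illustrates the general idea under stated assumptions: a presence term (the fraction of query terms found), a proximity term (how closely adjacent query terms appear, in query order, in the text), and an equal weighting of the two to obtain a score between 0 and 1. The function names, the equal weights, the 1/gap proximity measure, and the provider records are all hypothetical and are not taken from the article. Because every provider applies the same scoring function, the scores can be merged into one ranked list and then counted by facet.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tpp_score(query, document):
    """Illustrative presence-proximity score in [0, 1].

    ASSUMPTION: this is not the formula from the article, only a sketch
    of a score that rewards both term presence and in-order proximity.
    """
    q = tokenize(query)
    d = tokenize(document)
    if not q:
        return 0.0
    # Positions at which each query term occurs in the document.
    positions = {t: [i for i, w in enumerate(d) if w == t] for t in set(q)}
    # Presence: fraction of query terms found anywhere in the document.
    presence = sum(1 for t in q if positions[t]) / len(q)
    if len(q) == 1:
        return presence
    # Proximity: for each adjacent query pair (a, b), take the smallest
    # distance at which b follows a; adjacent words give 1.0, and pairs
    # that never occur in query order give 0.0.
    pair_scores = []
    for a, b in zip(q, q[1:]):
        gaps = [j - i for i in positions[a] for j in positions[b] if j > i]
        pair_scores.append(1.0 / min(gaps) if gaps else 0.0)
    proximity = sum(pair_scores) / len(pair_scores)
    # ASSUMPTION: equal weights keep the combined score within [0, 1].
    return 0.5 * presence + 0.5 * proximity


def federated_search(query, providers):
    """Score every record at every provider with the same function, then merge."""
    hits = []
    for name, records in providers.items():
        for rec in records:
            score = tpp_score(query, rec["description"])
            if score > 0:
                hits.append({"provider": name, "score": score, **rec})
    hits.sort(key=lambda h: h["score"], reverse=True)
    return hits


def facet_counts(hits, field):
    """Count hits per distinct value of one field (one facet)."""
    return Counter(h[field] for h in hits)


# Hypothetical provider holdings, invented for illustration only.
PROVIDERS = {
    "provider_a": [
        {"id": "a1", "type": "dataset",
         "description": "Solar wind plasma measurements"},
        {"id": "a2", "type": "instrument",
         "description": "Magnetometer for measuring wind driven solar disturbances"},
    ],
    "provider_b": [
        {"id": "b1", "type": "dataset",
         "description": "Hourly solar and interplanetary wind parameters"},
        {"id": "b2", "type": "dataset",
         "description": "Ionospheric electron density"},
    ],
}

if __name__ == "__main__":
    results = federated_search("solar wind", PROVIDERS)
    for hit in results:
        print(hit["provider"], hit["id"], round(hit["score"], 3))
    print(facet_counts(results, "type"))
```

In this sketch a record containing the exact phrase scores highest, a record with both terms in query order but farther apart scores lower, and a record with both terms in reverse order earns only the presence half of the score.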

Cite

King, T., Narock, T., Walker, R., Merka, J., & Joy, S. (2008). A brave new (virtual) world: distributed searches, relevance scoring and facets. Earth Science Informatics, 1(1), 29–34. https://doi.org/10.1007/s12145-008-0002-7
