What's new? What's certain? - Scoring search results in the presence of overlapping data sources

Abstract

Data integration projects in the life sciences often gather data on a particular subject from multiple sources. Some of these sources overlap to a certain degree. Therefore, integrated search results may be supported by one, a few, or all data sources. To reflect these differences, results should be ranked according to the number of data sources that support them. What such a ranking should look like is not clear per se. Either results supported by only a few sources are ranked high because this information is potentially new, or such results are ranked low because the strength of evidence supporting them is limited. We present two scoring schemes to rank search results in the integrated protein annotation database Columba. We define a surprisingness score, preferring results supported by few sources, and a confidence score, preferring frequently encountered information. Unlike many other scoring schemes, our proposal is purely data-driven and does not require users to specify preferences among sources. Both scores take the concrete overlaps of data sources into account and do not presume statistical independence. We show how our schemes have been implemented efficiently using SQL. © Springer-Verlag Berlin Heidelberg 2007.
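
The two ranking directions described in the abstract can be illustrated with a deliberately simplified sketch. It is not the scoring scheme from the paper: it only counts the distinct sources that support each result and sorts by that count, whereas the paper's scores also take the concrete overlaps between sources into account. The table name `support`, the function `rank_by_support`, the source names `A`, `B`, `C`, and the result identifiers are all hypothetical and chosen for illustration only.

```python
import sqlite3


def rank_by_support(rows, descending=True):
    """Rank results by the number of distinct sources that support them.

    rows: iterable of (result_id, source) pairs (hypothetical schema).
    descending=True  -> well-supported results first (confidence-like order).
    descending=False -> rarely supported results first (surprisingness-like order).
    This is a plain count, NOT the overlap-aware scores defined in the paper.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE support (result_id TEXT, source TEXT)")
    conn.executemany("INSERT INTO support VALUES (?, ?)", rows)
    order = "DESC" if descending else "ASC"
    query = (
        "SELECT result_id, COUNT(DISTINCT source) AS n_sources "
        "FROM support GROUP BY result_id "
        f"ORDER BY n_sources {order}, result_id ASC"
    )
    ranked = conn.execute(query).fetchall()
    conn.close()
    return ranked


# Toy data: three hypothetical sources A, B, C and three hypothetical results.
rows = [
    ("P1", "A"), ("P1", "B"), ("P1", "C"),
    ("P2", "A"),
    ("P3", "B"), ("P3", "C"),
]

confidence_order = rank_by_support(rows, descending=True)
surprisingness_order = rank_by_support(rows, descending=False)
print(confidence_order)
print(surprisingness_order)
```

In this toy data, `P1` is supported by all three sources and leads the confidence-like order, while `P2` is supported by one source only and leads the surprisingness-like order. A scheme that respects overlaps would additionally have to weigh how much the supporting sources share with each other, which this sketch leaves out.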

Cite

Hussels, P., Trißl, S., & Leser, U. (2007). What’s new? What’s certain? - Scoring search results in the presence of overlapping data sources. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4544 LNBI, pp. 231–246). Springer Verlag. https://doi.org/10.1007/978-3-540-73255-6_19
