Empirical Analysis of Ranking Models for an Adaptable Dataset Search

Abstract

Currently available datasets still have a large unexplored potential for interlinking. Ranking techniques contribute to this task by scoring datasets according to the likelihood of finding entities related to those of a target dataset. Ranked datasets can either be manually selected for standalone link discovery tasks or automatically inspected by programs that walk down the ranking looking for entity links. This work presents empirical comparisons between different ranking models and argues that different algorithms should be used depending on whether the ranking is handled manually or automatically, and on the metadata available for the datasets. Experiments indicate that the ranking algorithms that perform best with respect to nDCG do not always achieve the best recall at position k for high recall levels. To reach 90% recall, the best ranking model for the manual use case (with respect to nDCG) may need 13% more datasets than the best model for the automatic use case (with respect to recall@k): almost 47% of the ranking instead of just the top 34%.
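
For context, nDCG rewards placing relevant datasets near the top of the whole ranking, while recall@k measures only how many of the relevant datasets appear within the first k positions, which is why the two metrics can favor different models. The sketch below (a generic illustration with binary relevance labels, not the paper's implementation) shows how both metrics are computed:

# Minimal sketch: nDCG and recall@k over a binary-relevance ranking
# of candidate datasets. Toy data only, not the paper's code.
import math

def dcg(relevances):
    # Standard DCG with a log2 position discount;
    # relevances[i] is the relevance of the item at rank i + 1.
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(relevances):
    # Normalize by the DCG of the ideal (descending) ordering.
    ideal = sorted(relevances, reverse=True)
    return dcg(relevances) / dcg(ideal) if any(relevances) else 0.0

def recall_at_k(relevances, k):
    # Fraction of all relevant items that appear in the top k positions.
    total = sum(relevances)
    return sum(relevances[:k]) / total if total else 0.0

# Toy ranking: 1 marks a dataset that actually links to the target.
ranking = [1, 0, 1, 0, 0, 1, 0, 0, 1, 0]
print(ndcg(ranking))            # ~0.84: relevant items sit near the top
print(recall_at_k(ranking, 4))  # 0.5: only half the relevant datasets in top 4

In this toy ranking, nDCG is high because relevant items cluster near the top, yet recall@4 is only 0.5; a model tuned for nDCG can still force an automatic linker to scan deep into the ranking, which mirrors the trade-off the abstract reports.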

Cite (APA)

Neves, A. B., de Oliveira, R. G. G., Leme, L. A. P. P., Lopes, G. R., Nunes, B. P., & Casanova, M. A. (2018). Empirical Analysis of Ranking Models for an Adaptable Dataset Search. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10843 LNCS, pp. 50–64). Springer Verlag. https://doi.org/10.1007/978-3-319-93417-4_4
