DeShaTo: Describing the shape of cumulative topic distributions to rank retrieval systems without relevance judgments

Abstract

This paper investigates an approach for estimating the effectiveness of any IR system. The approach is based on the idea that a set of documents retrieved for a specific query is highly relevant if only a small number of topics predominate in the retrieved documents. The topic probability distribution of each document is determined offline, using Latent Dirichlet Allocation. Then, for a retrieved set of documents, two probability distribution shape descriptors, namely the skewness and the kurtosis, are used to compute a score based on the shape of the cumulative topic distribution of that set. The proposed model is termed DeShaTo, which is short for Describing the Shape of cumulative Topic distributions. In this work, DeShaTo is used to rank retrieval systems without relevance judgments. In most cases, the empirical results are better than those of the state-of-the-art approach. Unlike other approaches, DeShaTo works independently for each system. Therefore, it remains reliable even when there are fewer systems to be ranked.
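The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how skewness and kurtosis are combined into the final score, so the code below only computes the two descriptors. The function names and the hand-written topic vectors are hypothetical; in practice the per-document vectors would come from an LDA model fitted offline.

```python
import numpy as np


def cumulative_topic_distribution(doc_topics):
    """Sum per-document topic probabilities and renormalize to sum to 1.

    doc_topics: array-like of shape (n_documents, n_topics), where each row
    is assumed to be a topic probability vector (e.g. produced by LDA).
    """
    doc_topics = np.asarray(doc_topics, dtype=float)
    total = doc_topics.sum(axis=0)
    return total / total.sum()


def shape_descriptors(values):
    """Return (skewness, excess kurtosis) of the values, using population moments.

    A perfectly flat distribution has zero variance; (0.0, 0.0) is returned
    in that case as a convention chosen for this sketch.
    """
    values = np.asarray(values, dtype=float)
    centered = values - values.mean()
    m2 = np.mean(centered ** 2)
    if m2 < 1e-18:
        return 0.0, 0.0
    skewness = np.mean(centered ** 3) / m2 ** 1.5
    kurtosis = np.mean(centered ** 4) / m2 ** 2 - 3.0
    return float(skewness), float(kurtosis)


# Hypothetical retrieved sets over 4 topics (illustrative values only).
focused_set = [[0.85, 0.05, 0.05, 0.05],
               [0.70, 0.10, 0.10, 0.10]]   # one topic dominates
diffuse_set = [[0.40, 0.30, 0.20, 0.10],
               [0.10, 0.20, 0.30, 0.40]]   # topics cancel out to a flat mix

focused = shape_descriptors(cumulative_topic_distribution(focused_set))
diffuse = shape_descriptors(cumulative_topic_distribution(diffuse_set))
print("focused (skewness, kurtosis):", focused)
print("diffuse (skewness, kurtosis):", diffuse)
```

In this toy case the set dominated by a single topic yields a more strongly skewed cumulative distribution than the diffuse set, which is the kind of signal the abstract describes using as an indicator of relevance.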

Cite

Ionescu, R. T., Chifu, A. G., & Mothe, J. (2015). DeShaTo: Describing the shape of cumulative topic distributions to rank retrieval systems without relevance judgments. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9309, pp. 75–82). Springer Verlag. https://doi.org/10.1007/978-3-319-23826-5_8
