What to Read Next? Challenges and preliminary results in selecting representative documents

Abstract

The vast amount of scientific literature poses a challenge when one is trying to understand a previously unknown topic. Selecting a representative subset of documents that covers most of the desired content can solve this challenge by presenting the user a small subset of documents. We build on existing research on representative subset extraction and apply it in an information retrieval setting. Our document selection process consists of three steps: computation of the document representations, clustering, and selection of documents. We implement and compare two different document representations, two different clustering algorithms, and three different selection methods using a coverage and a redundancy metric. We execute our 36 experiments on two datasets, with 10 sample queries each, from different domains. The results show that there is no clear favorite and that we need to ask the question whether coverage and redundancy are sufficient for evaluating representative subsets.
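The three-step process described in the abstract (representation, clustering, selection) can be sketched roughly as follows. The abstract does not name the specific representations, clustering algorithms, or selection methods the authors compared, so everything concrete here is an assumption chosen for illustration: normalised term-frequency vectors as the representation, a plain k-means as the clustering step, and "closest document to each centroid" as the selection rule. All function names are hypothetical.

```python
import math
from collections import Counter


def tf_vectors(docs):
    """Step 1 (assumed representation): L2-normalised term-frequency vectors."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    vectors = []
    for d in docs:
        counts = Counter(d.lower().split())
        vec = [float(counts[w]) for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Step 2 (assumed clustering): k-means seeded with the first k vectors."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign every document to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        # Recompute each centroid as the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def select_representatives(vectors, labels, centroids):
    """Step 3 (assumed selection): per cluster, the document nearest the centroid."""
    picks = []
    for c, centroid in enumerate(centroids):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            picks.append(min(members, key=lambda i: sq_dist(vectors[i], centroid)))
    return sorted(picks)


def representative_subset(docs, k):
    """Run all three steps and return the indices of the selected documents."""
    vectors = tf_vectors(docs)
    labels, centroids = kmeans(vectors, k)
    return select_representatives(vectors, labels, centroids)


docs = [
    "neural network training deep learning",
    "protein folding gene expression",
    "deep learning neural network model",
    "gene expression protein sequence",
]
picks = representative_subset(docs, k=2)
print([docs[i] for i in picks])
```

On this toy input the sketch returns one document per topic. It does not compute the coverage or redundancy metrics, since the abstract does not define them.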

Cite

Beck, T., Böschen, F., & Scherp, A. (2018). What to Read Next? Challenges and preliminary results in selecting representative documents. In Communications in Computer and Information Science (Vol. 903, pp. 230–242). Springer Verlag. https://doi.org/10.1007/978-3-319-99133-7_19
