Abstract
Motivated by recent commentary that has questioned today's pursuit of ever-more complex models and mathematical formalisms in applied machine learning and whether meaningful empirical progress is actually being made, this paper tackles the decades-old problem of pseudo-relevance feedback with "the simplest thing that can possibly work". We present a technique based on training a document relevance classifier for each information need using pseudo-labels from an initial ranked list and then applying the classifier to rerank the retrieved documents. Experiments demonstrate significant improvements across a number of standard newswire collections, with initial rankings supplied by bag-of-words BM25 as well as from query expansion. Further evaluations in the TREC-COVID challenge using human relevance judgments verify the effectiveness and robustness of our proposed technique. While this simple idea draws elements from several well-known threads in the literature, to our knowledge this exact combination has not previously been proposed and rigorously evaluated.
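The pipeline described above (pseudo-label an initial ranked list, train a per-query classifier, rerank with it) can be sketched in a few lines of dependency-free Python. This is an illustrative sketch only, not the authors' implementation: the abstract does not say how pseudo-labels are chosen, which classifier is used, or how scores are combined. The sketch assumes the top `n_pos` documents are pseudo-positives, the bottom `n_neg` are pseudo-negatives, the features are binary bag-of-words, the classifier is a logistic regression trained by SGD, and the final ranking uses the classifier score alone. All function names, parameters, and the toy documents are hypothetical.

```python
import math

def tokenize(text):
    # Binary bag-of-words: the set of lowercased whitespace tokens.
    return set(text.lower().split())

def train_logreg(examples, epochs=100, lr=0.5):
    """Train a logistic regression by SGD.

    examples: list of (token_set, label) pairs with label in {0, 1}.
    Returns (weights, bias), where weights maps token -> float.
    """
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for tokens, label in examples:
            logit = bias + sum(weights.get(t, 0.0) for t in tokens)
            pred = 1.0 / (1.0 + math.exp(-logit))
            err = label - pred  # gradient of the log-likelihood w.r.t. the logit
            bias += lr * err
            for t in tokens:
                weights[t] = weights.get(t, 0.0) + lr * err
    return weights, bias

def rerank(ranked, n_pos=2, n_neg=2):
    """Rerank an initial list with a classifier trained on pseudo-labels.

    ranked: list of (doc_id, text), best first, e.g. from BM25.
    Assumption: top n_pos docs are relevant, bottom n_neg are not.
    """
    examples = [(tokenize(text), 1) for _, text in ranked[:n_pos]]
    examples += [(tokenize(text), 0) for _, text in ranked[-n_neg:]]
    weights, bias = train_logreg(examples)

    def score(text):
        # Raw logit; monotone in the predicted relevance probability.
        return bias + sum(weights.get(t, 0.0) for t in tokenize(text))

    # sorted() is stable, so ties keep their initial retrieval order.
    reordered = sorted(ranked, key=lambda d: score(d[1]), reverse=True)
    return [doc_id for doc_id, _ in reordered]

# Toy initial ranking for a hypothetical query about vaccine trials.
initial = [
    ("d1", "covid vaccine trial results"),
    ("d2", "vaccine trial efficacy covid"),
    ("d3", "football match results"),
    ("d4", "covid vaccine efficacy study"),
    ("d5", "football league match"),
    ("d6", "football transfer league"),
]
reranked = rerank(initial)
print(reranked)
```

In this toy list, `d4` shares vocabulary with the pseudo-positive documents while `d3` mostly shares vocabulary with the pseudo-negatives, so the classifier moves `d4` above `d3`. A practical system would likely use weighted term features and might combine the classifier score with the initial retrieval score; those choices are outside what the abstract states.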
Han, X., Liu, Y., & Lin, J. (2021). The Simplest Thing That Can Possibly Work: (Pseudo-)Relevance Feedback via Text Classification. In ICTIR 2021 - Proceedings of the 2021 ACM SIGIR International Conference on Theory of Information Retrieval (pp. 123–129). Association for Computing Machinery, Inc. https://doi.org/10.1145/3471158.3472261