Document-at-a-time (DaaT) and score-at-a-time (SaaT) query evaluation techniques are different approaches to top-k retrieval with inverted indexes. While modern systems are dominated by DaaT, the academic literature has seen decades of debate about the merits of each. Recently, there has been renewed interest in SaaT methods for learned sparse lexical models, where studies have shown that transformers generate "wacky weights"that appear to reduce opportunities for optimizations in DaaT methods. However, researchers currently lack an easy-to-use SaaT system to support further exploration. This is the gap that our work fills. Starting with a modern SaaT system (JASS), we built Python bindings in order to integrate into the DaaT Pyserini IR toolkit (Lucene). The result is a common frontend to both a DaaT and a SaaT system. We demonstrate how recent experiments with a wide range of learned sparse lexical models can be easily reproduced. Our contribution is a framework that enables future research comparing DaaT and SaaT methods in the context of modern neural retrieval models.
CITATION STYLE
Trotman, A., MacKenzie, J., Parameswaran, P., & Lin, J. (2022). A Common Framework for Exploring Document-at-a-Time and Score-at-a-Time Retrieval Methods. In SIGIR 2022 - Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 3229–3234). Association for Computing Machinery, Inc. https://doi.org/10.1145/3477495.3531657
Mendeley helps you to discover research relevant for your work.