Query Exposure Prediction for Groups of Documents in Rankings

Thomas Jaenich; Graham McDonald; Iadh Ounis

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2024) 14609 LNCS 143-158

DOI: 10.1007/978-3-031-56060-6_10

Abstract

The main objective of an Information Retrieval (IR) system is to provide a user with the documents most relevant to the user’s query. To do this, modern IR systems typically deploy a re-ranking pipeline in which a set of documents is retrieved by a lightweight first-stage retrieval process and then re-ranked by a more effective but more expensive model. However, the success of a re-ranking pipeline depends heavily on the performance of the first-stage retrieval, since new documents are not usually identified during the re-ranking stage. Moreover, the first-stage retrieval can limit the amount of exposure that a particular group of documents, such as documents from a particular demographic group, can receive in the final ranking. For example, the fair allocation of exposure becomes more challenging or impossible if the first-stage retrieval returns too few documents from certain groups, since the number of group documents in the ranking affects the exposure more than the documents’ positions. With this in mind, it is beneficial to predict the amount of exposure that a group of documents is likely to receive in the results of the first-stage retrieval process, in order to ensure that a sufficient number of documents from each group is included. In this paper, we introduce the novel task of query exposure prediction (QEP). Specifically, we propose the first approach for predicting the distribution of exposure that groups of documents will receive for a given query. Our new approach, called GEP, uses lexical information from individual groups of documents to estimate the exposure the groups will receive in a ranking. Our experiments on the TREC 2021 and 2022 Fair Ranking Track test collections show that our proposed GEP approach results in exposure predictions that are up to ∼40% more accurate than the predictions of suitably adapted existing query performance prediction (QPP) and resource allocation approaches.
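To make the notion of a "distribution of exposure over groups" concrete, the following is a minimal sketch of how such a distribution could be measured on an existing ranking. It is not the GEP approach, and the abstract does not state which exposure model the paper uses. The logarithmic position discount `1 / log2(rank + 1)`, the function name `group_exposure`, and the toy document and group labels are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def group_exposure(ranking, doc_groups):
    """Return each group's share of the total exposure in a ranking.

    Assumption: exposure at rank r is 1 / log2(r + 1), a common
    position-based discount; the paper may define exposure differently.
    """
    exposure = defaultdict(float)
    for rank, doc_id in enumerate(ranking, start=1):
        weight = 1.0 / math.log2(rank + 1)
        exposure[doc_groups[doc_id]] += weight

    total = sum(exposure.values())
    if total == 0:
        return {}  # empty ranking: no exposure to distribute
    return {group: value / total for group, value in exposure.items()}


# Hypothetical first-stage ranking and group labels (illustrative only).
ranking = ["d1", "d2", "d3", "d4"]
doc_groups = {"d1": "A", "d2": "A", "d3": "B", "d4": "A"}
shares = group_exposure(ranking, doc_groups)
```

In this toy ranking, group B holds a single document, so its share of exposure is bounded by that one document's discount wherever it is placed. This illustrates the abstract's point that the number of group documents returned by the first stage constrains what a re-ranker can achieve.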

Cite

Jaenich, T., McDonald, G., & Ounis, I. (2024). Query Exposure Prediction for Groups of Documents in Rankings. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 14609 LNCS, pp. 143–158). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-56060-6_10
