Two-stage learning to rank for information retrieval

Van Dang; Michael Bendersky; W. Bruce Croft

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2013) 7814 LNCS 423-434

DOI: 10.1007/978-3-642-36973-5_36

Abstract

Current learning to rank approaches commonly focus on learning the best possible ranking function given a small fixed set of documents. This document set is often retrieved from the collection using a simple unsupervised bag-of-words method, e.g. BM25. This can potentially lead to learning a sub-optimal ranking, since many relevant documents may be excluded from the initially retrieved set. In this paper we propose a novel two-stage learning framework to address this problem. We first learn a ranking function over the entire retrieval collection using a limited set of textual features including weighted phrases, proximities and expansion terms. This function is then used to retrieve the best possible subset of documents over which the final model is trained using a larger set of query- and document-dependent features. Empirical evaluation using two web collections unequivocally demonstrates that our proposed two-stage framework, being able to learn its model from more relevant documents, outperforms current learning to rank approaches. © 2013 Springer-Verlag.
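The two-stage flow described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the linear scoring model, the hand-set weights, and the toy feature values are all assumptions made for the example. In the paper both stages are learned from data, and the stage-one features are textual ones such as weighted phrases, proximities, and expansion terms.

```python
from typing import Dict, List, Sequence


def linear_score(weights: Sequence[float], feats: Sequence[float]) -> float:
    """Score a document as a weighted sum of its features (assumed model form)."""
    return sum(w * f for w, f in zip(weights, feats))


def two_stage_rank(
    doc_ids: List[str],
    stage1_feats: Dict[str, Sequence[float]],
    stage2_feats: Dict[str, Sequence[float]],
    w1: Sequence[float],
    w2: Sequence[float],
    k: int,
) -> List[str]:
    # Stage 1: score every document in the collection with the small
    # feature set and keep only the top-k candidates.
    candidates = sorted(
        doc_ids,
        key=lambda d: linear_score(w1, stage1_feats[d]),
        reverse=True,
    )[:k]
    # Stage 2: rerank the candidates with the larger feature set.
    return sorted(
        candidates,
        key=lambda d: linear_score(w2, stage2_feats[d]),
        reverse=True,
    )


# Hypothetical toy collection; all values are made up for illustration.
docs = ["d1", "d2", "d3", "d4"]
stage1 = {"d1": [0.9, 0.1], "d2": [0.2, 0.8], "d3": [0.1, 0.1], "d4": [0.6, 0.5]}
stage2 = {
    "d1": [0.2, 0.1, 0.3],
    "d2": [0.9, 0.8, 0.7],
    "d3": [1.0, 1.0, 1.0],
    "d4": [0.5, 0.4, 0.6],
}

ranking = two_stage_rank(docs, stage1, stage2, w1=[1.0, 0.5], w2=[1.0, 1.0, 1.0], k=3)
print(ranking)
```

In this toy data, `d3` would score highest under the stage-two model but never reaches it because stage one filters it out. That is the situation the abstract describes: the quality of the initially retrieved set bounds what the final ranker can achieve.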

Cite

Dang, V., Bendersky, M., & Croft, W. B. (2013). Two-stage learning to rank for information retrieval. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7814 LNCS, pp. 423–434). https://doi.org/10.1007/978-3-642-36973-5_36
