We show that several previously proposed passage-based document ranking principles, along with some new ones, can be derived from the same probabilistic model. We use language models to instantiate specific algorithms, and propose a passage language model that integrates information from the ambient document to an extent controlled by the estimated document homogeneity. Several document-homogeneity measures that we propose yield passage language models that are more effective than the standard passage model for basic document retrieval and for constructing and utilizing passage-based relevance models; the latter outperform a document-based relevance model. We also show that the homogeneity measures are effective means for integrating document-query and passage-query similarity information for document retrieval. © 2008 Springer-Verlag Berlin Heidelberg.
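The abstract does not spell out how the passage language model is estimated, so the following is only a minimal sketch under stated assumptions: the passage, document, and collection maximum-likelihood estimates are combined by linear interpolation, and the weight given to the ambient document grows with a homogeneity value in [0, 1] that is supplied by the caller. The function names, the `lambda_c` parameter, and the choice to score a document by its best-scoring passage are hypothetical and serve illustration only.

```python
import math
from collections import Counter


def mle(tokens):
    """Maximum-likelihood unigram estimate: relative term frequencies."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def passage_lm(passage, document, collection, homogeneity, lambda_c=0.1):
    """Return a function w -> p(w) for a document-aware passage model.

    Assumption (not taken from the abstract): the three estimates are mixed
    linearly, and the document weight is proportional to `homogeneity`.
    """
    if not 0.0 <= homogeneity <= 1.0:
        raise ValueError("homogeneity must lie in [0, 1]")
    lambda_doc = (1.0 - lambda_c) * homogeneity   # more homogeneous -> trust the document more
    lambda_psg = (1.0 - lambda_c) - lambda_doc    # remainder goes to the passage itself
    p_g, p_d, p_c = mle(passage), mle(document), mle(collection)

    def prob(w):
        return (lambda_psg * p_g.get(w, 0.0)
                + lambda_doc * p_d.get(w, 0.0)
                + lambda_c * p_c.get(w, 0.0))

    return prob


def query_log_likelihood(query, prob):
    """Sum of log p(w) over query terms; -inf if any term has zero probability."""
    total = 0.0
    for w in query:
        p = prob(w)
        if p == 0.0:
            return float("-inf")
        total += math.log(p)
    return total


def score_document(query, passages, collection, homogeneity, lambda_c=0.1):
    """Score a document by its best passage (one possible ranking principle)."""
    document = [w for g in passages for w in g]
    return max(
        query_log_likelihood(
            query, passage_lm(g, document, collection, homogeneity, lambda_c)
        )
        for g in passages
    )
```

With `homogeneity=0` the sketch reduces to a passage model smoothed only by the collection; with `homogeneity=1` every passage of a document receives the same, purely document-based model.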
Bendersky, M., & Kurland, O. (2008). Utilizing passage-based language models for document retrieval. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4956 LNCS, pp. 162–174). https://doi.org/10.1007/978-3-540-78646-7_17