Load-balancing in distributed selective search

Yubin Kim; Jamie Callan; J. Shane Culpepper; Alistair Moffat

Conference Proceedings

SIGIR 2016 - Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval (2016) 905-908

DOI: 10.1145/2911451.2914689

Abstract

Simulation and analysis have shown that selective search can reduce the cost of large-scale distributed information retrieval. By partitioning the collection into small topical shards, and then using a resource ranking algorithm to choose a subset of shards to search for each query, fewer postings are evaluated. Here we extend the study of selective search using a fine-grained simulation investigating: selective search efficiency in a parallel query processing environment; the difference in efficiency when term-based and sample-based resource selection algorithms are used; and the effect of two policies for assigning index shards to machines. Results obtained for two large datasets and four large query logs confirm that selective search is significantly more efficient than conventional distributed search. In particular, we show that selective search is capable of both higher throughput and lower latency in a parallel environment than is exhaustive search.
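
The two ideas in the abstract, choosing a subset of topical shards per query and assigning shards to machines, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the toy term-frequency scoring, and the greedy "heaviest shard to least-loaded machine" policy are hypothetical stand-ins, not the resource selection algorithms or assignment policies evaluated in the paper.

```python
def rank_shards(query_terms, shard_term_counts):
    """Rank shards by summed query-term frequency.

    A toy term-based selector (hypothetical); ties are broken by shard name.
    """
    scores = {
        shard: sum(counts.get(term, 0) for term in query_terms)
        for shard, counts in shard_term_counts.items()
    }
    return sorted(scores, key=lambda shard: (-scores[shard], shard))


def select_shards(query_terms, shard_term_counts, k):
    """Return the k highest-ranked shards; only these would be searched."""
    return rank_shards(query_terms, shard_term_counts)[:k]


def assign_shards_greedy(shard_loads, n_machines):
    """Assign shards to machines, heaviest first, each to the least-loaded machine.

    shard_loads maps shard name -> estimated load (an assumed input,
    for example derived from a query log). Returns (machines, totals).
    """
    machines = [[] for _ in range(n_machines)]
    totals = [0.0] * n_machines
    ordered = sorted(shard_loads.items(), key=lambda kv: (-kv[1], kv[0]))
    for shard, load in ordered:
        target = totals.index(min(totals))  # first least-loaded machine
        machines[target].append(shard)
        totals[target] += load
    return machines, totals


if __name__ == "__main__":
    # Hypothetical per-shard term statistics.
    shard_term_counts = {
        "sports": {"goal": 5, "team": 3},
        "finance": {"stock": 7, "goal": 1},
        "cooking": {"recipe": 4},
    }
    print(select_shards(["goal", "team"], shard_term_counts, k=2))
    print(assign_shards_greedy({"a": 5, "b": 4, "c": 3, "d": 2}, n_machines=2))
```

In this sketch a query touches only the top-k shards rather than every shard, which is where the reduction in evaluated postings comes from; the assignment step then tries to keep per-machine load even so that popular shards do not become a bottleneck.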

Cite

Kim, Y., Callan, J., Culpepper, J. S., & Moffat, A. (2016). Load-balancing in distributed selective search. In SIGIR 2016 - Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 905–908). Association for Computing Machinery, Inc. https://doi.org/10.1145/2911451.2914689
