Top-k query processing is a fundamental building block for efficient ranking in a large number of applications. Efficiency is a central issue, especially in distributed settings, where the data is spread across different nodes in a network. This paper introduces novel optimization methods for top-k aggregation queries in such distributed environments; the methods can be applied to all algorithms that fall into the frameworks of the prior TPUT and KLEE methods. The optimizations address 1) hierarchically grouping input lists into top-k operator trees and optimizing the tree structure, and 2) computing data-adaptive scan depths for different input sources. The paper presents comprehensive experiments with two different real-life datasets, using the ns-2 network simulator for a packet-level simulation of a large Internet-style network. © 2008 Springer-Verlag Berlin Heidelberg.
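For readers unfamiliar with the baseline framework that the abstract refers to, the following is a minimal sketch of a three-phase, TPUT-style top-k aggregation as it is commonly described in the literature. It is not the paper's optimized method: it covers neither operator trees nor adaptive scan depths. It assumes non-negative scores, summation as the aggregation function, and nodes simulated as in-process dictionaries. All names (`tput_top_k`, `partial_sums`, `kth_largest`, `node_lists`) are hypothetical and chosen for illustration only.

```python
import heapq
from collections import defaultdict


def partial_sums(seen):
    """Sum, per item, the scores the coordinator has received so far."""
    totals = defaultdict(float)
    for reported in seen:
        for item, score in reported.items():
            totals[item] += score
    return totals


def kth_largest(values, k):
    """Return the k-th largest value, or 0.0 if fewer than k values exist."""
    top = heapq.nlargest(k, values)
    return top[-1] if len(top) == k else 0.0


def tput_top_k(node_lists, k):
    """Sketch of a TPUT-style query (assumed: non-negative scores, sum).

    node_lists holds one dict per node, mapping item -> local score.
    """
    m = len(node_lists)

    # Phase 1: every node reports its local top-k items.
    seen = [
        dict(heapq.nlargest(k, scores.items(), key=lambda kv: kv[1]))
        for scores in node_lists
    ]
    # The k-th best partial sum is a lower bound on the k-th best total.
    tau1 = kth_largest(partial_sums(seen).values(), k)
    threshold = tau1 / m  # uniform per-node threshold

    # Phase 2: every node reports all items scoring at least the threshold.
    for i, scores in enumerate(node_lists):
        for item, score in scores.items():
            if score >= threshold:
                seen[i][item] = score
    partial = partial_sums(seen)
    tau2 = kth_largest(partial.values(), k)

    # Prune: an item not reported by a node scores below the threshold there,
    # so the threshold serves as an upper bound for each missing score.
    candidates = [
        item
        for item in partial
        if sum(seen[i].get(item, threshold) for i in range(m)) >= tau2
    ]

    # Phase 3: fetch exact scores for the surviving candidates only.
    exact = {
        item: sum(scores.get(item, 0.0) for scores in node_lists)
        for item in candidates
    }
    return heapq.nlargest(k, exact.items(), key=lambda kv: (kv[1], kv[0]))


if __name__ == "__main__":
    nodes = [
        {"a": 10, "b": 8, "c": 1, "d": 3},
        {"a": 2, "b": 9, "c": 7, "e": 4},
        {"b": 1, "c": 6, "d": 5, "e": 5},
    ]
    print(tput_top_k(nodes, 2))
```

In this sketch every node uses the same threshold in phase 2; this uniform choice is the point at which data-adaptive scan depths of the kind the abstract mentions would differ.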
Neumann, T., Bender, M., Michel, S., Schenkel, R., Triantafillou, P., & Weikum, G. (2008). Optimizing distributed top-k queries. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5175 LNCS, pp. 337–349). https://doi.org/10.1007/978-3-540-85481-4_26