Abstract
In this paper we investigate the top-k-selection problem, i.e., determining and sorting the top k elements, in the dynamic data model. Here, dynamic means that the underlying total order evolves over time and can only be probed by pairwise comparisons. We assume that only one pair of elements can be compared at each time step. This restricted access is a reasonable assumption in the dynamic model, especially for massive data sets, where it is impossible to access all the data before the next change occurs. Previously, only two special cases were studied in this model [1]: selecting the element of a given rank, and sorting all elements. This paper treats all k ∈ [n] systematically. Specifically, we identify the critical point k* such that the top-k-selection problem can be solved error-free with probability 1 − o(1) if and only if k = o(k*). We also determine a lower bound on the error when k = Ω(k*), which is tight under some conditions. In contrast, we show that the top-k-set problem, i.e., finding the top k elements without sorting them, can be solved error-free with probability 1 − o(1) for all 1 ≤ k ≤ n. Additionally, we consider some extensions of the dynamic data model and show that most of these results still hold.
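To make the access model concrete, the following is a minimal Python sketch of a hidden order that can be probed by only one pairwise comparison per time step. The abstract does not specify how the order evolves, so the rule used here (one uniformly random adjacent pair swaps after each comparison) is an assumption made purely for illustration, and the names `EvolvingOrder`, `compare`, and `true_top_k` are hypothetical rather than taken from the paper.

```python
import random


class EvolvingOrder:
    """Hidden total order over n >= 2 elements that changes after every probe."""

    def __init__(self, n, seed=0):
        self._rng = random.Random(seed)
        self._order = list(range(n))  # _order[0] is the current top element
        self._rng.shuffle(self._order)
        self.time = 0

    def _evolve(self):
        # ASSUMPTION (not stated in the abstract): one uniformly random
        # adjacent pair swaps places at each time step.
        i = self._rng.randrange(len(self._order) - 1)
        self._order[i], self._order[i + 1] = self._order[i + 1], self._order[i]

    def compare(self, a, b):
        """Return True if a currently ranks above b; costs one time step."""
        result = self._order.index(a) < self._order.index(b)
        self._evolve()
        self.time += 1
        return result

    def true_top_k(self, k):
        """Ground truth for evaluation only; an algorithm may not call this."""
        return list(self._order[:k])


if __name__ == "__main__":
    order = EvolvingOrder(n=8, seed=1)
    before = order.true_top_k(3)
    order.compare(0, 1)  # a single probe, after which the order evolves
    after = order.true_top_k(3)
    print(before, after, order.time)
```

Under this sketch, an algorithm sees only the results of `compare`, so any answer it assembles from past probes may already be stale when it is reported. That staleness is the source of error the abstract refers to.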
Cite
Huang, Q., Liu, X., Sun, X., & Zhang, J. (2015). How to select the top k elements from evolving data? In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9472, pp. 60–70). Springer Verlag. https://doi.org/10.1007/978-3-662-48971-0_6