Traditional query answering returns all answers T to a given query. When T is large, the user may be interested in viewing only a smaller subset S of T. Previous work has focused on finding subsets S that are diverse, i.e., such that all items s, s' in S are very different from one another. This paper focuses on a complementary problem, namely finding subsets that are highly representative of the entire set of query results. Intuitively, a representative subset S is similar, in values and proportionality, to the entire set T. Finding such a representative set is challenging both conceptually and in practice. This paper proposes a novel method of choosing a representative subset, called SimSTV, which draws inspiration from the field of voting theory. An efficient algorithm is presented that both overcomes and leverages the many differences between choosing answers in a database and voting in a real-life election. We also provide extensions to our algorithm, e.g., to accommodate affirmative action. Experimental results show the effectiveness of our algorithm.
CITATION
Behar, R., & Cohen, S. (2022). Representative Query Results by Voting. In Proceedings of the ACM SIGMOD International Conference on Management of Data (pp. 1741–1754). Association for Computing Machinery. https://doi.org/10.1145/3514221.3517858