Drawing density core-sets from incomplete relational data

Yongnan Liu; Jianzhong Li; Hong Gao

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2017) 10178 LNCS 527-542

DOI: 10.1007/978-3-319-55699-4_32

Abstract

Incompleteness is a ubiquitous issue that makes it challenging to answer queries with guaranteed completeness. A density core-set is a subset of an incomplete dataset whose completeness approximates that of the entire dataset. Density core-sets are an effective mechanism for estimating the completeness of queries on incomplete datasets. This paper studies the problems of drawing density core-sets from incomplete relational data. To the best of our knowledge, no such proposal has been made before. (1) We study the problems of drawing density core-sets under different requirements and prove that they are all NP-complete, whether or not functional dependencies are given. (2) We propose an efficient approximate algorithm for drawing an approximate density core-set, which employs an approximate Knapsack algorithm and weighted sampling techniques to select important candidate tuples. (3) Analysis of the proposed algorithm shows that, with high probability, the relative error between the completeness of the approximate density core-set and that of a density core-set of the same size is within a given relative error bound. (4) Experiments on both real-world and synthetic datasets demonstrate the effectiveness and efficiency of the algorithm.

Cite

Liu, Y., Li, J., & Gao, H. (2017). Drawing density core-sets from incomplete relational data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10178 LNCS, pp. 527–542). Springer Verlag. https://doi.org/10.1007/978-3-319-55699-4_32
