Deriving probabilistic databases with inference ensembles

Abstract

Many real-world applications deal with uncertain or missing data, prompting a surge of activity in the area of probabilistic databases. A shortcoming of prior work is the assumption that an appropriate probabilistic model, along with the necessary probability distributions, is given. We address this shortcoming by presenting a framework for learning a set of inference ensembles, termed meta-rule semi-lattices, or MRSL, from the complete portion of the data. We use the MRSL to infer probability distributions for missing data, and demonstrate experimentally that high accuracy is achieved when a single attribute value is missing per tuple. We next propose an inference algorithm based on Gibbs sampling that accurately predicts the probability distribution for multiple missing values. We also develop an optimization that greatly improves performance of multi-attribute inference for collections of tuples, while maintaining high accuracy. Finally, we develop an experimental framework to evaluate the efficiency and accuracy of our approach. © 2011 IEEE.
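The multi-attribute step can be illustrated with a generic Gibbs sampler. The sketch below is not the paper's MRSL method: it stands in for the learned ensembles with plain Laplace-smoothed conditional frequencies estimated from the complete tuples, and every name in it (`conditional`, `gibbs_infer`, `alpha`, `burn_in`, the dictionary-based tuple layout) is a hypothetical choice made for illustration. It only shows the general idea of resampling each missing attribute in turn, conditioned on the current values of all the others, and reading off an estimated joint distribution from the samples.

```python
import random
from collections import Counter


def conditional(complete, target, evidence, domains, alpha=1.0):
    """Estimate P(target | evidence) from complete tuples.

    Assumption: Laplace-smoothed frequency counts, used here as a simple
    stand-in for whatever model is learned from the complete data.
    """
    counts = Counter(
        row[target]
        for row in complete
        if all(row[attr] == val for attr, val in evidence.items())
    )
    total = sum(counts.values()) + alpha * len(domains[target])
    return {v: (counts[v] + alpha) / total for v in domains[target]}


def gibbs_infer(complete, incomplete, domains, n_samples=2000, burn_in=200, seed=0):
    """Estimate the joint distribution over the missing attributes of one tuple.

    `incomplete` maps attribute names to values, with None marking a missing
    value. Returns a dict from value combinations (ordered like the missing
    attributes) to estimated probabilities.
    """
    rng = random.Random(seed)
    missing = [attr for attr, val in incomplete.items() if val is None]
    state = dict(incomplete)
    # Start from an arbitrary assignment of the missing attributes.
    for attr in missing:
        state[attr] = rng.choice(domains[attr])

    joint = Counter()
    for step in range(burn_in + n_samples):
        # Resample each missing attribute given all other current values.
        for attr in missing:
            evidence = {a: v for a, v in state.items() if a != attr}
            dist = conditional(complete, attr, evidence, domains)
            values = list(dist)
            weights = [dist[v] for v in values]
            state[attr] = rng.choices(values, weights=weights)[0]
        # Discard the burn-in prefix, then count each sampled combination.
        if step >= burn_in:
            joint[tuple(state[attr] for attr in missing)] += 1
    return {combo: count / n_samples for combo, count in joint.items()}
```

Calling `gibbs_infer(complete, {"age": "old", "edu": None, "income": None}, domains)` on a list of complete dictionaries would return estimated probabilities for each (edu, income) combination. Conditioning on every other attribute at once makes the frequency counts sparse, which is one reason a real system would likely need a more careful model than this smoothed lookup.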

Cite

Stoyanovich, J., Davidson, S., Milo, T., & Tannen, V. (2011). Deriving probabilistic databases with inference ensembles. In Proceedings - International Conference on Data Engineering (pp. 303–314). https://doi.org/10.1109/ICDE.2011.5767854
