Provenance in a modifiable data set

Abstract

Provenance of data is now widely recognized as being of great importance, thanks in large part to pioneering work [4, 6] by Peter Buneman and his collaborators in a stream that continues to produce influential papers today [1-3, 7]. When we consume data from a database, we often care about where these data come from, how they were derived, and so forth. We may desire answers to such questions to establish trust in the data, to investigate suspicious values, to debug code in the system, or for a host of other reasons. Considerable recent work has addressed many issues related to provenance. However, the standard assumption is that data sources, from which result data have been derived, are static. In reality, we know that most data are modified over time, including data sources used for deriving results of interest. When we consider provenance in the context of such modifications, many new problems arise. This chapter addresses two key problems in this context:

1. Result data may no longer be valid after a source update. How can we efficiently determine whether a given result tuple is valid? When a result tuple is invalidated, can we explain what caused this invalidation?
2. We may have lost access to (some) source data. In such a situation, can we determine what is the missing source data on which some result tuple depends?

© 2013 Springer-Verlag Berlin Heidelberg.
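To make the first problem concrete, the following is a minimal sketch of one simple way to flag a result tuple whose sources have changed. It is not the method of the chapter: it assumes a hypothetical in-memory `SourceStore` that stamps every write with a version, and a `ResultTuple` that records, as its lineage, the version of each source tuple it was derived from. All names here are illustrative assumptions. The check is conservative, since a source update does not necessarily change the derived value.

```python
from dataclasses import dataclass, field


@dataclass
class SourceStore:
    """Hypothetical versioned source: tuple id -> (version, value)."""
    rows: dict = field(default_factory=dict)
    clock: int = 0  # monotonically increasing write counter

    def put(self, tid, value):
        # Every insert or update gets a fresh, never-reused version stamp.
        self.clock += 1
        self.rows[tid] = (self.clock, value)

    def delete(self, tid):
        self.rows.pop(tid, None)

    def version(self, tid):
        entry = self.rows.get(tid)
        return entry[0] if entry else None


@dataclass
class ResultTuple:
    value: object
    lineage: dict  # source tuple id -> version seen at derivation time


def derive_sum(store, tids):
    """Derive a result (here, a sum) and record which source versions it used."""
    lineage = {tid: store.version(tid) for tid in tids}
    total = sum(store.rows[tid][1] for tid in tids)
    return ResultTuple(total, lineage)


def explain_invalidation(store, result):
    """List (tid, reason) for each lineage tuple that changed since derivation."""
    causes = []
    for tid, seen in result.lineage.items():
        current = store.version(tid)
        if current is None:
            causes.append((tid, "deleted"))
        elif current != seen:
            causes.append((tid, "updated"))
    return causes


def is_possibly_valid(store, result):
    """True if no source tuple in the lineage has been modified or removed."""
    return not explain_invalidation(store, result)
```

Under these assumptions, the same lineage map also bears on the second problem: the ids it records name the source tuples a result depends on, even when the store no longer holds them.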

Zhang, J., & Jagadish, H. V. (2013). Provenance in a modifiable data set. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 8000, 557–567. https://doi.org/10.1007/978-3-642-41660-6_31
