Provenance in a modifiable data set

Jing Zhang; H. V. Jagadish

Journal Article

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2013) 8000 557-567

DOI: 10.1007/978-3-642-41660-6_31

Abstract

Provenance of data is now widely recognized as being of great importance, thanks in large part to pioneering work [4, 6] by Peter Buneman and his collaborators in a stream that continues to produce influential papers today [1-3, 7]. When we consume data from a database, we often care about where these data come from, how they were derived, and so forth. We may desire answers to such questions to establish trust in the data, to investigate suspicious values, to debug code in the system, or for a host of other reasons. Considerable recent work has addressed many issues related to provenance. However, the standard assumption is that data sources, from which result data have been derived, are static. In reality, we know that most data are modified over time, including data sources used for deriving results of interest. When we consider provenance in the context of such modifications, many new problems arise. This chapter addresses two key problems in this context: (1) Result data may no longer be valid after a source update. How can we efficiently determine whether a given result tuple is valid? When a result tuple is invalidated, can we explain what caused this invalidation? (2) We may have lost access to (some) source data. In such a situation, can we determine what is the missing source data on which some result tuple depends? © 2013 Springer-Verlag Berlin Heidelberg.

Cite

Zhang, J., & Jagadish, H. V. (2013). Provenance in a modifiable data set. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 8000, 557–567. https://doi.org/10.1007/978-3-642-41660-6_31
