Why and where: A characterization of data provenance?

Peter Buneman; Sanjeev Khanna; Wang Chiew Tan

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2001), Vol. 1973, pp. 316–330

DOI: 10.1007/3-540-44503-x_20

Abstract

With the proliferation of database views and curated databases, the issue of data provenance - where a piece of data came from and the process by which it arrived in the database - is becoming increasingly important, especially in scientific databases where understanding provenance is crucial to the accuracy and currency of data. In this paper we describe an approach to computing provenance when the data of interest has been created by a database query. We adopt a syntactic approach and present results for a general data model that applies to relational databases as well as to hierarchical data such as XML. A novel aspect of our work is a distinction between “why” provenance (which refers to the source data that had some influence on the existence of the data) and “where” provenance (which refers to the location(s) in the source databases from which the data was extracted).
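The distinction between the two kinds of provenance can be illustrated with a small sketch. It is not the paper's formal model: the relations `EMPLOYEES` and `DEPARTMENTS`, the function `names_in_city`, and the row ids are all hypothetical, and the why-provenance here is simplified to the union of contributing source rows rather than a set of separate witnesses. The query is a join that returns the names of employees whose department is located in a given city.

```python
from typing import Dict, Set, Tuple

Row = Tuple[str, int]            # (relation, row id)
Location = Tuple[str, int, str]  # (relation, row id, column)

# Hypothetical source relations, keyed by row id (assumption, not from the paper).
EMPLOYEES = {
    1: {"name": "Ann", "dept": "D1"},
    2: {"name": "Bob", "dept": "D2"},
}
DEPARTMENTS = {
    10: {"dept": "D1", "city": "Edinburgh"},
    11: {"dept": "D2", "city": "Philadelphia"},
}


def names_in_city(city: str):
    """Join employees with departments and return, per output name,
    a simplified why-provenance and where-provenance."""
    why: Dict[str, Set[Row]] = {}
    where: Dict[str, Set[Location]] = {}
    for e_id, emp in EMPLOYEES.items():
        for d_id, dep in DEPARTMENTS.items():
            if emp["dept"] == dep["dept"] and dep["city"] == city:
                name = emp["name"]
                # Why: every source row that influenced the existence of
                # this output, including the department row used in the join.
                why.setdefault(name, set()).update(
                    {("employees", e_id), ("departments", d_id)}
                )
                # Where: only the location the output value was copied from.
                where.setdefault(name, set()).add(("employees", e_id, "name"))
    return why, where


why, where = names_in_city("Edinburgh")
```

In this sketch the department row appears in the why-provenance of "Ann" because the join condition depends on it, but it does not appear in the where-provenance, since the output value is copied only from the `name` field of the employee row.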

APA

Buneman, P., Khanna, S., & Tan, W. C. (2001). Why and where: A characterization of data provenance? In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1973, pp. 316–330). Springer Verlag. https://doi.org/10.1007/3-540-44503-x_20
