Abstract
Data warehouses support better decision making by facilitating data analysis, so data quality is of primary importance. ETL (extract, transform, load) is the process that extracts data from sources, transforms it, and ultimately loads it into target warehouses. Although ETL workflows can be designed with ETL tools, data exceptions are largely left to human analysis and are handled inadequately. Detecting exceptions early improves the stability and efficiency of ETL workflows. To this end, a novel approach, Backwards Constraint Propagation (BCP), is proposed that automatically analyzes ETL workflows and verifies the target-end restrictions at the earliest possible points. BCP builds an ETL graph from a given ETL workflow, encodes the target-end restrictions as integrity constraints, and propagates them backwards from the target to the sources through the ETL graph by applying constraint projection rules. It is shown that BCP supports most relational algebra operators and data transformation functions. © 2009 Springer Berlin Heidelberg.
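The backward propagation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the workflow is simplified to a linear chain of steps rather than a general ETL graph, only NOT NULL constraints are modeled, and every name here (`NotNull`, `Rename`, `Derive`, `project_back`, `propagate_backwards`, `violations`) is hypothetical. The `Derive` step further assumes a null-strict transformation, meaning it returns null whenever any input is null.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass(frozen=True)
class NotNull:
    """Integrity constraint: `column` must not be null (hypothetical encoding)."""
    column: str


@dataclass
class Rename:
    """ETL step that renames columns; mapping is old name -> new name."""
    mapping: dict[str, str]

    def project_back(self, c: NotNull) -> list[NotNull]:
        # A constraint on a renamed column becomes a constraint on the old name.
        inverse = {new: old for old, new in self.mapping.items()}
        return [NotNull(inverse.get(c.column, c.column))]


@dataclass
class Derive:
    """ETL step computing `output` from `inputs` (assumed null-strict)."""
    output: str
    inputs: list[str]

    def project_back(self, c: NotNull) -> list[NotNull]:
        # Constraints on untouched columns pass through unchanged.
        if c.column != self.output:
            return [c]
        # Under null-strictness, a non-null output requires non-null inputs.
        return [NotNull(col) for col in self.inputs]


def propagate_backwards(steps, target_constraints):
    """Push target-end constraints through the steps, last step first."""
    constraints = list(target_constraints)
    for step in reversed(steps):
        constraints = [p for c in constraints for p in step.project_back(c)]
    return constraints


def violations(rows, constraints):
    """Return (row index, column) pairs that break a NOT NULL constraint."""
    return [
        (i, c.column)
        for i, row in enumerate(rows)
        for c in constraints
        if row.get(c.column) is None
    ]


# Source columns fname/lname are renamed, then combined into full_name.
steps = [
    Rename({"fname": "first", "lname": "last"}),
    Derive("full_name", ["first", "last"]),
]
source_constraints = propagate_backwards(steps, [NotNull("full_name")])
bad_rows = violations([{"fname": "Ada", "lname": None}], source_constraints)
```

In this sketch the target restriction on `full_name` ends up as two restrictions on the source columns `fname` and `lname`, so the offending row can be rejected at extraction time instead of failing at load time.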
Liu, J., Liang, S., Ye, D., Wei, J., & Huang, T. (2009). ETL workflow analysis and verification using backwards constraint propagation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5565 LNCS, pp. 455–469). https://doi.org/10.1007/978-3-642-02144-2_36