Rule discovery for (semi-)automatic repairs of ETL processes

Abstract

A data source integration layer, commonly called extract-transform-load (ETL), is one of the core components of information systems. It is applicable to standard data warehouse (DW) architectures as well as to data lake (DL) architectures. The ETL layer runs processes that ingest, transform, integrate, and upload data into a DW or DL. The ETL layer is not static, since the data sources it integrates change their structures. As a consequence, an already deployed ETL process stops working and needs to be redesigned (repaired). Companies typically have thousands to hundreds of thousands of ETL processes deployed. For this reason, a technique and software support for semi-automatically repairing failed ETL processes is of vital practical importance. This problem has been only partially solved by existing technology and research, and the available solutions still require an immense amount of work from an ETL administrator. Our solution is based on case-based reasoning combined with repair rules. In this paper, we contribute a method for the automatic discovery of repair rules from a stored history of repair cases.

Cite

Awiti, J., & Wrembel, R. (2020). Rule discovery for (semi-)automatic repairs of ETL processes. In Communications in Computer and Information Science (Vol. 1243 CCIS, pp. 250–264). Springer. https://doi.org/10.1007/978-3-030-57672-1_19
