We present and validate a method, together with its underlying technologies, data structures and algorithms, to calculate, categorize and visualize component dependencies, data lineage and business semantics from database structures and queries, independently of the actual data in the data warehouse. The chosen approach is based on semantic techniques, probabilistic weight calculation and estimation of the impact of data in queries; an implemented rule system supports the calculation of the dependency graph from these estimates. We demonstrate a method for business semantics integration and ontology learning from data structures and schemas, combined with the query semantics captured by the dependency graph. Annotating technical assets with a business ontology provides meaning and a governance view for human and machine agents addressing various planning, automation and decision support problems. Data processing performance and business ontology integration are evaluated and analyzed over several real-life datasets.
Tomingas, K., Järv, P., & Tammet, T. (2019). Computing data lineage and business semantics for data warehouse. In Communications in Computer and Information Science (Vol. 914, pp. 101–124). Springer Verlag. https://doi.org/10.1007/978-3-319-99701-8_5