Modeling data lakes with data vault: Practical experiences, assessment, and lessons learned

Corinna Giebler; Christoph Gröger; Eva Hoos; Holger Schwarz; Bernhard Mitschang

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2019), Vol. 11788 LNCS, pp. 63–77

DOI: 10.1007/978-3-030-33223-5_7

Abstract

Data lakes have become popular to enable organization-wide analytics on heterogeneous data from multiple sources. Data lakes store data in their raw format and are often characterized as schema-free. Nevertheless, it has turned out that data still need to be modeled, as neglecting data modeling may lead to issues concerning, e.g., quality and integration. In current research literature and industry practice, Data Vault is a popular modeling technique for structured data in data lakes. It promises a flexible, extensible data model that preserves data in their raw format. However, hardly any research or assessment exists on the practical usage of Data Vault for modeling data lakes. In this paper, we assess the Data Vault model’s suitability for the data lake context, present lessons learned, and investigate success factors for the use of Data Vault. Our discussion is based on the practical usage of Data Vault in a large, global manufacturer’s data lake and the insights gained in real-world analytics projects.

Giebler, C., Gröger, C., Hoos, E., Schwarz, H., & Mitschang, B. (2019). Modeling data lakes with data vault: Practical experiences, assessment, and lessons learned. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11788 LNCS, pp. 63–77). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-33223-5_7
