Business decisions today rely heavily on data in data warehouse (DWH) systems, so data quality (DQ) in DWH is a highly relevant topic. Consequently, sophisticated yet easy-to-use solutions for monitoring and ensuring high data quality are needed. This paper is based on the IQM4HD project, in which a prototype of an automated data quality monitoring system was designed and implemented. Specifically, we focus on expressing advanced data quality rules, such as checking whether data conforms to a certain time series or whether data deviates significantly in any of the dimensions within a data cube. We show how such types of data quality rules can be expressed in our domain-specific language (DSL) RADAR, which was introduced in [10]. Since manual specification of such rules tends to be complex, it is particularly important to support the DQ manager in detecting and creating potential rules by profiling historic data. We therefore also explain the data profiling component of our prototype and illustrate how advanced rules can be semi-automatically detected and suggested to the DQ manager.
Heine, F., Kleiner, C., & Oelsner, T. (2019). Automated Detection and Monitoring of Advanced Data Quality Rules. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11706 LNCS, pp. 238–247). Springer. https://doi.org/10.1007/978-3-030-27615-7_18