Data quality (DQ) measures the status of data along different dimensions. This broad topic came to the fore in the 1980s, when it was first discussed and studied. A high-quality dataset correlates with good performance of artificial intelligence (AI) algorithms and decision-making processes. Checking the quality of the data inside a decision support system (DSS) is therefore an essential pre-processing step and benefits further analysis. In this paper, a theoretical framework for a DQ module in a DSS is proposed. The framework evaluates the quality status in three stages: against the European guidelines, against DQ metrics, and by checking a subset of data cleaning (DC) problems. Additionally, the framework supports the user in identifying and fixing the DC problems, which speeds up the process. As output, the user receives a DQ report and the DC pipeline to execute in order to improve the dataset's quality. An implementation of the framework is illustrated in a proof of concept (POC) for an industrial use case. In the POC, the execution of the framework's phases is demonstrated on a public time series dataset containing quarter-hourly consumption profiles of residential electricity customers in Belgium for the year 2016.
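The abstract does not spell out the DQ metrics the framework uses. As a hedged illustration only, two metrics commonly found in the DQ literature, completeness and validity, could be computed over a quarter-hourly consumption profile along these lines (the function names, the sample values, and the plausibility bounds are all illustrative assumptions, not the paper's definitions):

```python
def completeness(values):
    """Fraction of readings that are present (not missing)."""
    return sum(v is not None for v in values) / len(values)

def validity(values, low=0.0, high=20.0):
    """Fraction of present readings inside an assumed plausible kW range."""
    present = [v for v in values if v is not None]
    return sum(low <= v <= high for v in present) / len(present)

# Hypothetical quarter-hourly consumption profile (kW):
# two missing readings and one physically implausible value.
profile = [0.3, 0.4, None, 0.5, -1.0, 0.4, None, 0.6]

report = {
    "completeness": completeness(profile),  # 6 of 8 readings present
    "validity": validity(profile),          # 5 of 6 present readings in range
}
print(report)
```

A DQ report in the spirit of the framework would aggregate scores like these per dimension, flagging the failing readings as input to the DC pipeline.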
Citation
Rinaldi, G., Garcia, F. C., Agudelo, O. M., Becker, T., Vanthournout, K., Mestdagh, W., & De Moor, B. (2023). A Framework for a Data Quality Module in Decision Support Systems: An Application with Smart Grid Time Series. In International Conference on Enterprise Information Systems, ICEIS - Proceedings (Vol. 1, pp. 443–452). Science and Technology Publications, Lda. https://doi.org/10.5220/0011749700003467