Accurate data analysis requires high-quality data. In practice, however, real-world data frequently contain inconsistencies, which lead to untrustworthy decisions in the downstream analysis pipeline. In this research, we examine the problem of inconsistency detection and repair for the OMD data model. We propose a framework for data quality assessment and a repair framework for OMD. We formally define a weight-based repair-by-deletion semantics and present an automated weight generation mechanism that takes multiple input criteria into account. Weight generation relies on multi-criteria decision making based on the correlation, contrast, and conflict among the criteria, several of which are often needed in data cleaning. Once the weights are generated, we present a Min-Sum dynamic programming algorithm that finds the minimum-weight solution. Finally, we apply evolutionary optimisation techniques and show, on medical datasets, improved performance that is feasible in practice.
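As a rough illustration of how weights might be derived from the correlation, contrast, and conflict among several criteria, the sketch below uses the well-known CRITIC weighting scheme. This is an assumption: the abstract does not name the method the authors use, and the criterion names and values here are hypothetical. Contrast is measured as the standard deviation of each min-max-normalised criterion, and conflict as the sum of (1 - Pearson correlation) with every other criterion.

```python
from math import sqrt


def min_max(column):
    """Scale a criterion column to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def std_dev(xs):
    """Population standard deviation (the 'contrast' of a criterion)."""
    mean = sum(xs) / len(xs)
    return sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))


def pearson(xs, ys):
    """Pearson correlation; defined as 0.0 when either column is constant."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    sx, sy = std_dev(xs), std_dev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def critic_weights(columns):
    """Return one weight per criterion column, summing to 1 (CRITIC-style)."""
    norm = [min_max(c) for c in columns]
    info = []
    for j, cj in enumerate(norm):
        # Conflict: how much criterion j disagrees with all the others.
        conflict = sum(1 - pearson(cj, ck) for k, ck in enumerate(norm) if k != j)
        info.append(std_dev(cj) * conflict)
    total = sum(info)
    if total == 0:
        # No contrast or conflict at all: fall back to uniform weights.
        return [1 / len(columns)] * len(columns)
    return [c / total for c in info]


# Hypothetical criteria scored for four candidate tuples (illustrative values only).
criteria = {
    "source_reliability": [1, 2, 3, 4],
    "recency": [1, 2, 3, 4],
    "completeness": [4, 3, 2, 1],
}
weights = critic_weights(list(criteria.values()))
for name, w in zip(criteria, weights):
    print(f"{name}: {w:.3f}")
```

In this toy input the two perfectly correlated criteria carry redundant information, so the scheme gives more weight to the criterion that conflicts with them. Per-tuple deletion weights could then be formed as a weighted sum of the normalised criteria, though how the paper combines them is not stated in the abstract.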
Ramasamy, A., Sisay, B., & Bahiru, A. (2021). A Data Science Framework for Data Quality Assessment and Inconsistency Detection. International Journal of Advanced Computer Science and Applications, 12(4), 605–613. https://doi.org/10.14569/IJACSA.2021.0120476