A Data Science Framework for Data Quality Assessment and Inconsistency Detection

Abstract

Accurate data analysis requires high-quality data. However, inconsistencies occur frequently in real-world data and lead to untrustworthy decisions in the downstream data analysis pipeline. In this research, we examine the problem of inconsistency detection and repair for the OMD data model. We propose a framework for data quality evaluation and an OMD repair framework. We formally define weight-based repair-by-deletion semantics and present an automated weight generation mechanism that takes multiple input criteria into account. Our approach is rooted in multi-criteria decision making, which considers the correlation, contrast, and conflict among multiple criteria, as is often necessary in data cleaning. After weight generation, we present Min-Sum, a dynamic programming algorithm that finds the minimum-weight solution. We then apply evolutionary optimisation techniques and use medical datasets to demonstrate improved performance that makes the approach practically feasible.
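
The abstract does not spell out the weight generation procedure. As a rough illustration only, the sketch below uses the well-known CRITIC scheme (Criteria Importance Through Intercriteria Correlation), which derives criterion weights from contrast (standard deviation) and conflict (one minus correlation). Treating CRITIC as the method is an assumption, and the function name `critic_weights` and the sample scores are hypothetical.

```python
import numpy as np

def critic_weights(scores):
    """Return one weight per criterion using the CRITIC scheme.

    scores: (n_items, n_criteria) array, higher = better on each criterion.
    Assumes at least one criterion is not constant across items.
    """
    scores = np.asarray(scores, dtype=float)
    # Min-max normalise each criterion column to [0, 1].
    low = scores.min(axis=0)
    span = scores.max(axis=0) - low
    span[span == 0] = 1.0  # constant column: avoid division by zero
    norm = (scores - low) / span
    # Contrast: how much each criterion varies across items.
    contrast = norm.std(axis=0, ddof=1)
    # Conflict: how much each criterion disagrees with the others.
    corr = np.nan_to_num(np.corrcoef(norm, rowvar=False), nan=0.0)
    conflict = (1.0 - corr).sum(axis=0)
    # Information content per criterion, normalised to sum to 1.
    info = contrast * conflict
    return info / info.sum()

# Hypothetical scores for four tuples on three criteria.
scores = [[1, 10, 3], [2, 8, 1], [3, 6, 4], [4, 2, 2]]
weights = critic_weights(scores)
```

A per-tuple deletion weight could then be formed as a weighted sum of the normalised criterion scores; how the paper actually combines them is not stated in the abstract.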

Ramasamy, A., Sisay, B., & Bahiru, A. (2021). A Data Science Framework for Data Quality Assessment and Inconsistency Detection. International Journal of Advanced Computer Science and Applications, 12(4), 605–613. https://doi.org/10.14569/IJACSA.2021.0120476
