Data consistency is one of the central issues of data quality management. Content-related conditional functional dependencies (CCFDs) are a practical technique for enforcing data consistency: they catch inconsistencies by grouping content-related data together. In particular, the repairing sequence plays a key role in consistency repairing. Some repairing sequences may produce unexpected results (e.g., incorrect repairs or results with extra repairing cost). Hence, reasonable repairing sequences are advocated and readily supported by commercial systems for better performance. To meet this need, this paper presents a method for determining the repairing sequence of inconsistencies in content-related data. (1) We present a repairing sequence graph for CCFDs to select the inconsistencies that should be repaired first. (2) We analyze the repairing mutex and discuss the interaction between the repairing sequence and the repairing mutex. (3) We prove that the problem of determining the repairing sequence with minimum repairing cost is NP-complete, so our method finds an appropriate repairing sequence heuristically. An empirical evaluation on three datasets shows that our solution is effective.
Citation
Du, Y., Shen, D., Nie, T., Kou, Y., & Yu, G. (2017). Determining repairing sequence of inconsistencies in content-related data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10569 LNCS, pp. 524–539). Springer Verlag. https://doi.org/10.1007/978-3-319-68783-4_36