In this paper, we propose a knowledge-based rule management system for data cleaning. The system combines features of rule-based systems and rule-based data cleaning frameworks. It offers three main advantages. First, it proposes a strong, unified rule form based on first-order structure that permits the representation and management of all types of rules, together with their quality, via a set of characteristics. Second, it improves the quality of the rules, which in turn determines the quality of data cleaning. Third, it uses an appropriate knowledge acquisition process, which is the weakest task in current rule-based and knowledge-based systems. Because several research works have shown that data cleaning is driven by domain knowledge rather than by data, we have identified and analyzed the properties that distinguish knowledge and rules from data in order to better determine the main components of the proposed system. To illustrate the system, we also present a first experiment with a case study in the health sector, where we demonstrate how the system helps improve data quality. The autonomy, extensibility, and platform independence of the proposed rule management system facilitate its incorporation into any system concerned with data quality management.
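To make the idea of a unified rule form concrete, the following is a minimal sketch, not the representation used in the paper. It assumes a rule can be modeled as a condition and an action over a record, plus quality characteristics attached to the rule itself. The names `Rule`, `confidence`, `source`, `apply_rules`, and the example health-record fields are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


@dataclass
class Rule:
    """Hypothetical unified rule form: premise, conclusion, quality metadata."""
    name: str
    condition: Callable[[Record], bool]  # premise: does the rule apply?
    action: Callable[[Record], Record]   # conclusion: returns a cleaned copy
    confidence: float = 1.0              # assumed quality characteristic
    source: str = "expert"               # assumed provenance characteristic


def apply_rules(record: Record, rules: List[Rule],
                min_confidence: float = 0.0) -> Record:
    """Apply every sufficiently trusted rule, most confident first."""
    cleaned = dict(record)  # never mutate the caller's record
    for rule in sorted(rules, key=lambda r: r.confidence, reverse=True):
        if rule.confidence >= min_confidence and rule.condition(cleaned):
            cleaned = rule.action(cleaned)
    return cleaned


# Illustrative rules for an invented patient record layout.
SEX_CODES = {"m": "M", "male": "M", "f": "F", "female": "F"}


def _sex_key(r: Record) -> str:
    return str(r.get("sex", "")).strip().lower()


RULES = [
    Rule(
        name="normalize_sex",
        condition=lambda r: _sex_key(r) in SEX_CODES,
        action=lambda r: {**r, "sex": SEX_CODES[_sex_key(r)]},
        confidence=0.95,
    ),
    Rule(
        name="blank_implausible_age",
        condition=lambda r: isinstance(r.get("age"), (int, float))
        and not 0 <= r["age"] <= 130,
        action=lambda r: {**r, "age": None},
        confidence=0.6,
        source="mined",
    ),
]

if __name__ == "__main__":
    print(apply_rules({"sex": " Female ", "age": 212}, RULES))
```

Keeping the quality characteristics on the rule lets a caller filter by them: passing `min_confidence=0.9` in this sketch would apply only the normalization rule and leave the age untouched.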
BRADJI, L., & BOUFAIDA, M. (2011). A Rule Management System for Knowledge Based Data Cleaning. Intelligent Information Management, 03(06), 230–239. https://doi.org/10.4236/iim.2011.36028