A Novel Framework and Model for Data Warehouse Cleansing

Daya Gupta; Payal Pahwa; Rajiv Arora

Journal Article

International Journal of Computer Applications (2011) 32(8) 6-13

Abstract

Data cleansing is the process of identifying corrupt and duplicate data in the data sets of a data warehouse in order to improve data quality. This paper aims to facilitate data cleansing by addressing the detection of duplicate records in the name attributes of those data sets. It presents a novel framework, expressed as a sequence of algorithms, for identifying duplicates in the name attribute of the data sets of an existing data warehouse. The key features of the research are the framework itself, built from a well-defined sequence of algorithms, and a refined application of alliance rules that incorporates existing, well-defined similarity computation measures. The reported results show the feasibility and validity of the proposed method.
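
The abstract does not say which similarity measures or algorithms the paper uses, so the following is only a minimal sketch of the general idea of flagging candidate duplicates in a name attribute. It assumes Python's standard-library `difflib.SequenceMatcher` as a stand-in similarity measure and an arbitrary threshold of 0.85. The function names and sample names are hypothetical, and the sketch does not implement the paper's framework or its alliance rules.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(name: str) -> str:
    """Lowercase, replace periods and commas with spaces, and collapse whitespace."""
    cleaned = name.lower().replace(".", " ").replace(",", " ")
    return " ".join(cleaned.split())


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] computed by difflib on the normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def candidate_duplicates(names, threshold=0.85):
    """Return (name_a, name_b, score) for each pair scoring at or above the threshold.

    The threshold is an assumed value for illustration, not one taken from the paper.
    """
    pairs = []
    for a, b in combinations(names, 2):
        score = similarity(a, b)
        if score >= threshold:
            pairs.append((a, b, score))
    return pairs


# Hypothetical sample values for a name attribute.
names = ["John Smith", "john  smith.", "Mary Jones"]
for a, b, score in candidate_duplicates(names):
    print(f"{a!r} ~ {b!r}: {score:.2f}")
```

Comparing every pair is quadratic in the number of records, so a real warehouse would likely need some form of blocking or sorting before pairwise comparison. That step is left out here to keep the sketch short.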

Citation (APA)

Gupta, D., Pahwa, P., & Arora, R. (2011). A Novel Framework and Model for Data Warehouse Cleansing. International Journal of Computer Applications, 32(8), 6–13. Retrieved from http://research.ijcaonline.org/volume32/number8/pxc3875533.pdf
