Data quality can have a major impact on a company's existing business processes, yet many companies have still to understand its importance. A common data quality problem in many companies in Indonesia is that input data are not filtered, which leads to non-standardized data patterns. This problem can be handled with data preprocessing, one method of which is data profiling. Data profiling is the process of collecting information about data. This research focuses on conducting data profiling with a data pattern method, using an algorithm adopted from OpenRefine and then modified. The profiling results from the open source tools Pentaho Data Integration, Google OpenRefine, and Data Cleaner differ considerably: Pentaho Data Integration and Google OpenRefine each found exactly 70 data patterns, whereas Data Cleaner found only 31.
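To illustrate what a single-column data pattern analysis might look like, the following is a minimal sketch in Python. It is not the authors' modified algorithm, nor the implementation in OpenRefine, Pentaho Data Integration, or Data Cleaner; the function names `value_pattern` and `profile_column`, the symbol choices (`A`, `a`, `9`), and the sample values are all assumptions made for illustration.

```python
from collections import Counter


def value_pattern(value: str) -> str:
    """Reduce a value to its pattern (hypothetical symbol scheme).

    Uppercase letters become 'A', other letters become 'a',
    digits become '9'; every other character is kept unchanged.
    """
    symbols = []
    for ch in value:
        if ch.isdigit():
            symbols.append("9")
        elif ch.isalpha():
            symbols.append("A" if ch.isupper() else "a")
        else:
            symbols.append(ch)
    return "".join(symbols)


def profile_column(values):
    """Count how often each pattern occurs in a single column."""
    return Counter(value_pattern(v) for v in values)


# Illustrative sample: one date column entered in inconsistent formats.
sample = ["2018-01-15", "15/01/2018", "2018-02-03", "Jan 5, 2018"]

for pattern, count in profile_column(sample).most_common():
    print(pattern, count)
```

Under this scheme, values that share a format collapse into one pattern, so the number of distinct patterns in a column gives a rough measure of how non-standardized the input is. Different tools may group characters differently, which is one plausible reason pattern counts can differ between them.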
CITATION STYLE
Amethyst, S. R., Kusumasari, T. F., & Hasibuan, M. A. (2018). Data Pattern Single Column Analysis for Data Profiling using an Open Source Platform. In IOP Conference Series: Materials Science and Engineering (Vol. 453). Institute of Physics Publishing. https://doi.org/10.1088/1757-899X/453/1/012024