Discretization is one of the most important steps in preprocessing decision tables. Transforming continuous attribute values into discrete intervals affects all subsequent analysis with data mining methods; in particular, the accuracy of the generated predictions depends strongly on the quality of the discretization. This paper describes three new heuristic algorithms for discretizing numeric data, based on Boolean reasoning. It also introduces an entropy-based evaluation of discretization, which is used to compare the results of the proposed algorithms with those of leading university software for data analysis. When discretization is treated as a data compression method, the average compression ratio achieved on the databases examined in the paper is 8.02, with database consistency maintained at 100%.
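To make the terms in the abstract concrete, the following minimal Python sketch shows what it means to discretize a decision table with cut points and to check that the result is still consistent, that is, that no two objects with identical discretized attribute values carry different decisions. It is not one of the paper's three algorithms: the toy table, the hand-picked cut points, and the function names (discretize, discretize_table, is_consistent) are all assumptions made for illustration.

```python
from bisect import bisect_right


def discretize(value, cuts):
    """Return the index of the interval that `value` falls into.

    `cuts` is a sorted list of cut points; k cuts define k + 1 intervals.
    """
    return bisect_right(cuts, value)


def discretize_table(rows, cuts_per_attr):
    """Discretize every numeric attribute of every row.

    `rows` is a list of (attribute_values, decision) pairs and
    `cuts_per_attr` holds one sorted list of cut points per attribute.
    """
    return [
        (tuple(discretize(v, cuts) for v, cuts in zip(values, cuts_per_attr)),
         decision)
        for values, decision in rows
    ]


def is_consistent(rows):
    """True if rows with equal attribute values always share one decision."""
    seen = {}
    for values, decision in rows:
        if seen.setdefault(values, decision) != decision:
            return False
    return True


# Hypothetical toy decision table: two numeric attributes and a decision.
table = [
    ((0.8, 2.0), "no"),
    ((1.0, 0.5), "yes"),
    ((1.3, 3.0), "no"),
    ((1.4, 1.0), "yes"),
    ((1.6, 1.0), "yes"),
]

# Illustrative cut points chosen by hand, not produced by any algorithm.
cuts = [[1.2], [1.5]]

discrete = discretize_table(table, cuts)
print(discrete)
print("consistent:", is_consistent(discrete))

# Dropping the cut on the second attribute merges objects that have
# different decisions, so the discretized table becomes inconsistent.
too_coarse = discretize_table(table, [[1.2], []])
print("consistent with fewer cuts:", is_consistent(too_coarse))
```

In this toy case two cuts are enough to keep the table consistent, while removing one of them is not. A discretization heuristic of the kind the abstract describes searches for a small set of cuts that still preserves consistency.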
Jankowski, C., Reda, D., Mańkowski, M., & Borowik, G. (2015). Discretization of data using Boolean transformations and information theory based evaluation criteria. Bulletin of the Polish Academy of Sciences: Technical Sciences, 63(4), 923–932. https://doi.org/10.1515/bpasts-2015-0105