We study the mining of incomplete data sets under two interpretations of missing attribute values: lost values and “do not care” conditions. For data mining we use characteristic sets and generalized maximal consistent blocks. We also use three types of probabilistic approximations (lower, middle and upper), so altogether we apply six approaches to data mining. Because it has been shown that the error rate associated with such data mining is not universally smaller for any one approach, our objective is instead to compare the six approaches in terms of the complexity of the induced rule sets. We conclude that there are statistically significant differences between these approaches.
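To make the terminology concrete, the following is a minimal sketch of characteristic sets and probabilistic approximations on a toy table. It is not the authors' implementation, and several things in it are assumptions rather than details from the abstract: the usual rough-set conventions that a lost value (`?`) puts a case in no attribute-value block while a “do not care” condition (`*`) puts it in every block of that attribute; the concept-style approximation (a union of characteristic sets of cases in the concept); the thresholds 1, 0.5 and a small positive value for lower, middle and upper; and all data, attribute names and function names, which are hypothetical. Generalized maximal consistent blocks are not covered.

```python
from fractions import Fraction

LOST, DONT_CARE = "?", "*"  # assumed markers for the two interpretations


def attribute_value_blocks(cases, attributes):
    """Map each (attribute, value) pair to the set of cases in its block."""
    blocks = {}
    for a in attributes:
        specified = {row[a] for row in cases.values()} - {LOST, DONT_CARE}
        for v in specified:
            # A "do not care" case joins every block of the attribute;
            # a lost-value case joins none of them.
            blocks[(a, v)] = {
                x for x, row in cases.items() if row[a] in (v, DONT_CARE)
            }
    return blocks


def characteristic_set(x, cases, attributes, blocks):
    """Intersect the blocks of x's specified values; missing values impose no constraint."""
    result = set(cases)
    for a in attributes:
        v = cases[x][a]
        if v not in (LOST, DONT_CARE):
            result &= blocks[(a, v)]
    return result


def probabilistic_approximation(concept, alpha, cases, attributes):
    """Union of K(x) for x in the concept with Pr(concept | K(x)) >= alpha."""
    blocks = attribute_value_blocks(cases, attributes)
    approx = set()
    for x in concept:
        k = characteristic_set(x, cases, attributes, blocks)  # always contains x
        if Fraction(len(k & concept), len(k)) >= alpha:
            approx |= k
    return approx


# Hypothetical toy data set: five cases, two attributes.
attributes = ["Temperature", "Headache"]
cases = {
    1: {"Temperature": "high", "Headache": "yes"},
    2: {"Temperature": LOST, "Headache": "yes"},
    3: {"Temperature": "high", "Headache": DONT_CARE},
    4: {"Temperature": "normal", "Headache": "no"},
    5: {"Temperature": DONT_CARE, "Headache": "no"},
}
concept = {1, 2, 5}  # hypothetical concept (cases sharing one decision value)

blocks = attribute_value_blocks(cases, attributes)
char_sets = {x: characteristic_set(x, cases, attributes, blocks) for x in cases}

lower = probabilistic_approximation(concept, Fraction(1), cases, attributes)
middle = probabilistic_approximation(concept, Fraction(1, 2), cases, attributes)
upper = probabilistic_approximation(concept, Fraction(1, 1000), cases, attributes)

print(char_sets)
print(lower, middle, upper)
```

On this toy table the three thresholds give three different approximations of the same concept, which is why rule sets induced from them can differ in size. Exact fractions are used so that the comparison against the threshold is not affected by floating-point rounding.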
Clark, P. G., Gao, C., Grzymala-Busse, J. W., Mroczek, T., & Niemiec, R. (2018). Complexity of rule sets induced by characteristic sets and generalized maximal consistent blocks. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10842 LNAI, pp. 301–310). Springer Verlag. https://doi.org/10.1007/978-3-319-91262-2_27