Associative Classification (AC) is a well-known technique in knowledge discovery and has been shown to produce competitive classifiers. However, imbalanced data poses a challenge for most classifier learning algorithms, including AC methods. Because Interestingness Measures (IMs) play an important role in the AC process, both in generating interesting rules and in building good classifiers, selecting suitable IMs is essential for improving AC’s performance on imbalanced data. In this paper, we aim to improve AC’s performance on imbalanced data by studying IMs. This involves two main tasks: the first is to find which measures behave similarly on imbalanced data, and the second is to select appropriate measures. We evaluate each measure’s performance by AUC, which is commonly used to evaluate classification on imbalanced data. First, based on these performances, we propose a frequent correlated patterns mining method to extract stable clusters in which the IMs behave similarly. Second, using an IM ranking computation method, we identify 26 measures suited to imbalanced data and divide them into two groups: one especially for extremely imbalanced data and the other suitable for slightly imbalanced data.
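To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It computes three textbook interestingness measures (support, confidence, lift) for a class association rule A -> c from raw counts, and a pairwise AUC for a scored test set. The function names `rule_measures` and `auc`, the count arguments, and the choice of these three measures are assumptions made for illustration; the abstract does not list which measures were studied.

```python
from typing import Dict, List


def rule_measures(n_ac: int, n_a: int, n_c: int, n: int) -> Dict[str, float]:
    """Support, confidence, and lift for a rule A -> c.

    n_ac: records matching antecedent A and class c
    n_a:  records matching antecedent A
    n_c:  records of class c
    n:    total number of records
    (Hypothetical helper; argument names are illustrative.)
    """
    support = n_ac / n
    confidence = n_ac / n_a
    lift = confidence / (n_c / n)
    return {"support": support, "confidence": confidence, "lift": lift}


def auc(scores: List[float], labels: List[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Illustrative imbalanced setting: 50 minority records out of 1000.
minority_rule = rule_measures(n_ac=20, n_a=40, n_c=50, n=1000)
example_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

In this made-up example the rule covers only 2% of the data, yet its lift is high because the class itself is rare. That kind of disagreement between measures is one reason measure selection can matter on imbalanced data.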
CITATION STYLE
Yang, G., & Cui, X. (2015). A study of interestingness measures for associative classification on imbalanced data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9441, pp. 141–151). Springer Verlag. https://doi.org/10.1007/978-3-319-25660-3_12