Why is rule learning optimistic and how to correct it

Abstract

In their search through a huge space of possible hypotheses, rule induction algorithms compare quality estimates of a large number of rules to find the one that appears best. This mechanism can easily find random patterns in the data which will, even though the estimation method itself may be unbiased (such as relative frequency), receive optimistically high quality estimates. It is generally believed that this problem, which eventually leads to overfitting, can be alleviated by using the m-estimate of probability. We show that this only partially mends the problem, and we propose a novel solution that makes common rule evaluation functions account for the multiple comparisons performed during the search. Experiments on artificial data sets and on data sets from the UCI repository show a large improvement in the accuracy of probability predictions, as well as a decent gain in the AUC of the constructed models.
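As a rough illustration of the phenomenon the abstract describes, the sketch below compares the relative frequency and the m-estimate of probability (Cestnik, 1990), (s + m·p) / (n + m), when a search keeps the best of many purely random candidate rules. The specific parameter values (m = 2, 20 covered examples, 1000 candidates) are illustrative assumptions, not values taken from the paper.

import random

def m_estimate(s, n, p_prior, m=2.0):
    # m-estimate of probability (Cestnik, 1990): shrinks the relative
    # frequency s/n toward the class prior p_prior; m sets the pull.
    return (s + m * p_prior) / (n + m)

random.seed(0)
p_true = 0.5         # every candidate rule is in fact random: P(positive) = 0.5
n_covered = 20       # examples covered by each candidate rule (assumed)
n_candidates = 1000  # number of rules compared during the search (assumed)

# Positive examples covered by each random candidate rule.
successes = [sum(random.random() < p_true for _ in range(n_covered))
             for _ in range(n_candidates)]

# The search keeps the rule whose estimated quality is highest.
best_rel_freq = max(s / n_covered for s in successes)
best_m_est = max(m_estimate(s, n_covered, p_true) for s in successes)

print(f"true accuracy of every rule  : {p_true:.2f}")
print(f"best rule by relative freq.  : {best_rel_freq:.2f}")  # far above 0.50
print(f"best rule by m-estimate (m=2): {best_m_est:.2f}")     # still above 0.50

Even under the m-estimate, the best-of-many score stays well above the true 0.5, which matches the paper's point that shrinkage toward the prior only partially corrects the optimism introduced by multiple comparisons.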

Citation (APA)

Možina, M., Demšar, J., Žabkar, J., & Bratko, I. (2006). Why is rule learning optimistic and how to correct it. In Lecture Notes in Computer Science (Vol. 4212 LNAI, pp. 330–340). Springer. https://doi.org/10.1007/11871842_33
