Abstract
Compression measures used in inductive learners, such as measures based on the minimum description length principle, can be used as a basis for grading candidate hypotheses. Compression-based induction is also well suited to handling noisy data. This paper shows that a simple compression measure can be used to detect noisy training examples, where the noise is due to random classification errors. A technique is proposed in which noisy examples are detected and eliminated from the training set, and a hypothesis is then built from the remaining examples. This noise elimination method was applied to preprocess data for four machine-learning algorithms and was evaluated on selected medical domains. © 2000 Taylor and Francis Group, LLC.
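The detect-then-eliminate idea can be illustrated with a minimal sketch. The abstract does not specify the compression measure, so the code below substitutes a toy one that is an assumption, not the paper's: for examples with a single numeric feature, hypothesis complexity is taken as the number of class changes along the sorted feature axis. An example is treated as noisy when removing it lowers that complexity by at least `min_gain`. The names `complexity`, `eliminate_noise`, and `min_gain` are hypothetical.

```python
from typing import List, Tuple

Example = Tuple[float, int]  # (feature value, class label)


def complexity(examples: List[Example]) -> int:
    """Toy complexity measure (an assumption, not the paper's measure):
    the number of class changes along the sorted feature axis."""
    labels = [label for _, label in sorted(examples)]
    return sum(1 for a, b in zip(labels, labels[1:]) if a != b)


def eliminate_noise(
    examples: List[Example], min_gain: int = 2
) -> Tuple[List[Example], List[Example]]:
    """Greedily remove the example whose removal reduces complexity the most,
    stopping when no single removal reduces it by at least min_gain."""
    kept = list(examples)
    removed: List[Example] = []
    while len(kept) > 2:
        base = complexity(kept)
        best_gain, best_idx = 0, None
        for i in range(len(kept)):
            gain = base - complexity(kept[:i] + kept[i + 1:])
            if gain > best_gain:
                best_gain, best_idx = gain, i
        if best_idx is None or best_gain < min_gain:
            break
        removed.append(kept.pop(best_idx))
    return kept, removed


if __name__ == "__main__":
    data = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 0), (5.0, 0),
            (6.0, 1), (7.0, 1), (8.0, 0), (9.0, 1), (10.0, 1)]
    cleaned, suspects = eliminate_noise(data)
    print("suspected noise:", suspects)
    print("complexity before/after:", complexity(data), complexity(cleaned))
```

A learner would then be trained on `cleaned` only. The threshold `min_gain` controls how aggressive the filter is: a lower value removes more examples and risks discarding genuine boundary cases along with the noise.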
Cite
Gamberger, D., Lavrac, N., & Dzeroski, S. (2000). Noise detection and elimination in data preprocessing: experiments in medical domains. Applied Artificial Intelligence, 14(2), 205–223. https://doi.org/10.1080/088395100117124