Learning noisy linear classifiers via adaptive and selective sampling

Abstract

We introduce efficient margin-based algorithms for selective sampling and filtering in binary classification tasks. Experiments on real-world textual data reveal that our algorithms perform significantly better than popular and similarly efficient competitors. Using the so-called Mammen-Tsybakov low noise condition to parametrize the instance distribution, and assuming linear label noise, we show bounds on the convergence rate to the Bayes risk of a weaker adaptive variant of our selective sampler. Our analysis reveals that, excluding logarithmic factors, the average risk of this adaptive sampler converges to the Bayes risk at rate $N^{-(1+\alpha)(2+\alpha)/(2(3+\alpha))}$, where $N$ denotes the number of queried labels and $\alpha > 0$ is the exponent in the low noise condition. For all $\alpha > \sqrt{3}-1 \approx 0.73$ this convergence rate is asymptotically faster than the rate $N^{-(1+\alpha)/(2+\alpha)}$ achieved by the fully supervised version of the base selective sampler, which queries all labels. Moreover, for $\alpha \to \infty$ (hard margin condition) the gap between the semi- and fully-supervised rates becomes exponential. © 2010 The Author(s).
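To make the margin-based selective sampling idea concrete, here is a minimal sketch in Python. It is illustrative only and not the paper's algorithm: it assumes a regularized least-squares linear predictor and a hand-picked shrinking margin threshold, whereas the actual sampler, its adaptive threshold, and the filtering variant are specified in the full text.

```python
import numpy as np

# Illustrative margin-based selective sampler with a regularized
# least-squares (RLS) linear predictor. The query rule and threshold
# schedule below are simplified placeholders, not the rule analyzed
# in the paper.

class MarginSelectiveSampler:
    def __init__(self, dim, reg=1.0, threshold_scale=1.0):
        self.A = reg * np.eye(dim)   # regularized correlation matrix
        self.b = np.zeros(dim)       # sum of y_t * x_t over queried examples
        self.w = np.zeros(dim)       # current linear predictor
        self.num_queried = 0         # labels requested so far
        self.threshold_scale = threshold_scale

    def predict(self, x):
        # Sign of the margin w . x, with ties broken as +1.
        return 1.0 if self.w @ x >= 0 else -1.0

    def should_query(self, x):
        # Query the label only when the margin is small compared to a
        # threshold that shrinks as more labels are collected
        # (a hypothetical ~ sqrt(log t / t) schedule).
        t = self.num_queried + 1
        threshold = self.threshold_scale * np.sqrt(np.log(1.0 + t) / t)
        return abs(self.w @ x) <= threshold

    def update(self, x, y):
        # Standard RLS update on a queried example (y in {-1, +1}).
        self.A += np.outer(x, x)
        self.b += y * x
        self.w = np.linalg.solve(self.A, self.b)
        self.num_queried += 1


# Example: run the sampler over a synthetic stream of labeled instances.
rng = np.random.default_rng(0)
dim = 10
true_w = rng.normal(size=dim)
sampler = MarginSelectiveSampler(dim)

for _ in range(1000):
    x = rng.normal(size=dim)
    y = 1.0 if true_w @ x >= 0 else -1.0  # noiseless labels for illustration
    if sampler.should_query(x):           # only then is the label "paid for"
        sampler.update(x, y)

print("labels queried:", sampler.num_queried)
```

The point of the query rule is that confident (large-margin) predictions skip the label request, so the number of queried labels N grows much more slowly than the length of the instance stream.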

Citation (APA)

Cavallanti, G., Cesa-Bianchi, N., & Gentile, C. (2011). Learning noisy linear classifiers via adaptive and selective sampling. Machine Learning, 83(1), 71–102. https://doi.org/10.1007/s10994-010-5191-x
