On the sample complexity of weak learning

Abstract

In this paper, we study the sample complexity of weak learning. That is, we ask how many data must be collected from an unknown distribution in order to extract a small but significant advantage in prediction. We show that it is important to distinguish between those learning algorithms that output deterministic hypotheses and those that output randomized hypotheses. We prove that in the weak learning model, any algorithm using deterministic hypotheses to weakly learn a class of Vapnik-Chervonenkis dimension d(n) requires Ω(√d(n)) examples. In contrast, when randomized hypotheses are allowed, we show that Θ(1) examples suffice in some cases. We then show that there exists an efficient algorithm using deterministic hypotheses that weakly learns against any distribution on a set of size d(n) with only O(d(n)^(2/3)) examples. Thus for the class of symmetric Boolean functions over n variables, where the strong learning sample complexity is Θ(n), the sample complexity for weak learning using deterministic hypotheses is Ω(√n) and O(n^(2/3)), and the sample complexity for weak learning using randomized hypotheses is Θ(1). Next we prove the existence of classes for which the distribution-free sample size required to obtain a slight advantage in prediction over random guessing is essentially equal to that required to obtain arbitrary accuracy. Finally, for a class of small circuits, namely all parity functions of subsets of n Boolean variables, we prove a weak learning sample complexity of Θ(n). This bound holds even if the weak learning algorithm is allowed to replace random sampling with membership queries, and the target distribution is uniform on {0, 1}^n. © 1995 Academic Press, Inc.

Cite

Goldman, S. A., Kearns, M. J., & Schapire, R. E. (1995). On the sample complexity of weak learning. Information and Computation, 117(2), 276–287. https://doi.org/10.1006/inco.1995.1045
