Statistical Query Learning (1998; Kearns)

  • Feldman V

Abstract

Entry editor: Rocco A. Servedio

INDEX TERMS: statistical query, PAC learning, classification noise, noise-tolerant learning, SQ dimension.

1 PROBLEM DEFINITION

The problem deals with learning to classify from random labeled examples in Valiant's PAC model [Val84]. In the random classification noise model of Angluin and Laird [AL88], the label of each example given to the learning algorithm is flipped randomly and independently with some fixed probability η, called the noise rate. Robustness to such a benign form of noise is an important goal in the design of learning algorithms. Kearns defined a powerful and convenient framework for constructing noise-tolerant algorithms based on statistical queries. Statistical query (SQ) learning is a natural restriction of PAC learning that models algorithms which use statistical properties of a data set rather than individual examples. Kearns demonstrated that any learning algorithm based on statistical queries can be automatically converted to a learning algorithm tolerant to random classification noise of any rate smaller than the information-theoretic barrier of 1/2. This result was used to give the first noise-tolerant algorithms for a number of important learning problems. In fact, virtually all known noise-tolerant PAC algorithms were either obtained from SQ algorithms or can easily be cast into the SQ model. In subsequent work, Kearns' model has been extended to other settings and has found a number of additional applications in machine learning and theoretical computer science.

1.1 Definitions and Notation

Let C be a class of {−1, +1}-valued functions (also called concepts) over an input space X. In the basic PAC model, a learning algorithm is given examples of an unknown function f ∈ C on points randomly chosen from some unknown distribution D over X and should produce a hypothesis h that approximates f.
More formally, an example oracle EX(f, D) is an oracle that upon being invoked returns an example ⟨x, f(x)⟩, where x is chosen randomly with respect to D, independently of any previous examples. A learning algorithm for C is an algorithm that, for every ε > 0, δ > 0, f ∈ C, and distribution D over X, given ε, δ, and access to EX(f, D), outputs, with probability at least 1 − δ, a hypothesis h that ε-approximates f with respect to D (i.e., Pr_D[f(x) ≠ h(x)] ≤ ε). We denote this distribution over labeled examples by D_f. Efficient learning algorithms are algorithms that run in time polynomial in 1/ε, 1/δ, and the size s of the learning problem. The size of a learning problem is determined by the description length of f under some fixed representation.
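The definitions above can be made concrete with a toy sketch (the concept class, the oracle helper names, and the sample size below are illustrative choices, not part of the entry): thresholds f_t(x) = +1 iff x ≥ t over X = [0, 1] under the uniform distribution, learned from the example oracle EX(f, D).

```python
import random

def make_example_oracle(f, sample_d):
    """EX(f, D): each call returns an example (x, f(x)) with x drawn
    independently from D (here, sample_d generates one draw from D)."""
    def draw():
        x = sample_d()
        return x, f(x)
    return draw

def learn_threshold(draw, n):
    """Toy PAC learner for thresholds on [0, 1]: hypothesize the smallest
    positively labeled point among n examples from the oracle."""
    pos = [x for x, y in (draw() for _ in range(n)) if y == +1]
    return min(pos) if pos else 1.0

random.seed(0)
t = 0.5
f = lambda x: +1 if x >= t else -1
draw = make_example_oracle(f, random.random)  # D = uniform on [0, 1]
t_hat = learn_threshold(draw, n=200)
# Under uniform D, the error Pr_D[f(x) != h(x)] equals t_hat - t,
# which is small with high probability over the 200 random examples.
```

Here h(x) = +1 iff x ≥ t_hat, so the hypothesis ε-approximates f once some positive example falls within ε of t, which happens with probability at least 1 − (1 − ε)^n.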
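Kearns' noise-tolerance conversion rests on the fact that a statistical query, i.e., an estimate of E_D[χ(x, f(x))] up to an additive tolerance τ, can still be answered from noisy examples. A minimal sketch for correlation queries χ(x, y) = y·g(x) (the helper names and the crude Chernoff-style sample size are illustrative assumptions): under noise rate η, the observed expectation shrinks by exactly a factor of 1 − 2η, so rescaling the empirical average recovers the noiseless value.

```python
import random

def noisy_oracle(f, sample_d, eta):
    """EX(f, D) with random classification noise: label flipped w.p. eta."""
    def draw():
        x = sample_d()
        y = f(x)
        return (x, -y) if random.random() < eta else (x, y)
    return draw

def estimate_correlation(draw, g, eta, tau):
    """Estimate E_D[g(x) * f(x)] to within ~tau from noisy examples.
    Since E_noisy[y * g(x)] = (1 - 2*eta) * E_D[f(x) * g(x)], estimating
    the noisy expectation to tolerance tau*(1-2*eta) and dividing by
    (1 - 2*eta) answers the query; this breaks down as eta -> 1/2."""
    n = int(4 / (tau * (1 - 2 * eta)) ** 2)  # crude Chernoff-style bound
    avg = sum(y * g(x) for x, y in (draw() for _ in range(n))) / n
    return avg / (1 - 2 * eta)

random.seed(1)
f = lambda x: +1 if x >= 0.5 else -1
draw = noisy_oracle(f, random.random, eta=0.3)
est = estimate_correlation(draw, f, eta=0.3, tau=0.1)
# Querying the correlation of f with itself: the true value is 1.0,
# and est should be within roughly tau of it despite 30% label noise.
```

The blow-up in sample size as η approaches 1/2 mirrors the statement in the entry: the conversion works for any noise rate strictly below the information-theoretic barrier of 1/2.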


Feldman, V. (2014). Statistical Query Learning (1998; Kearns). In Encyclopedia of Algorithms (pp. 1–7). Springer US. https://doi.org/10.1007/978-3-642-27848-8_401-2
