Bandit problems with infinitely many arms

Donald A. Berry; Robert W. Chen; Alan Zame; David C. Heath; Larry A. Shepp

Journal Article, Open Access

Annals of Statistics (1997) 25(5) 2103-2116

DOI: 10.1214/aos/1069362389

80 Citations

39 Readers

Abstract

We consider a bandit problem consisting of a sequence of n choices from an infinite number of Bernoulli arms, with n → ∞. The objective is to minimize the long-run failure rate. The Bernoulli parameters are independent observations from a distribution F. We first assume F to be the uniform distribution on (0, 1) and consider various extensions. In the uniform case we show that the best lower bound for the expected failure proportion is between √2/√n and 2/√n and we exhibit classes of strategies that achieve the latter.
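
To make the setting concrete, the following is a minimal simulation sketch of the uniform case. The abstract does not spell out the strategies, so the rule used here is an illustrative assumption rather than the authors' exact construction: draw a fresh arm, discard it at its first failure, and commit to the first arm that produces m consecutive successes. The function name simulate_run_strategy and the parameters n, m and rng are hypothetical. Heuristically, such a rule spends about m failures while searching and roughly n/m failures after committing, which is on the order of 2/√n when m is near √n.

```python
import math
import random


def simulate_run_strategy(n, m, rng):
    """Failure proportion over n pulls for a simplified run-based rule.

    Assumed rule (illustrative, not taken from the paper):
    - each new arm has success probability p drawn from Uniform(0, 1);
    - an arm is discarded at its first failure, unless it has already
      produced m consecutive successes, in which case it is kept for
      all remaining pulls.
    """
    failures = 0
    pulls = 0
    while pulls < n:
        p = rng.random()      # Bernoulli parameter of a fresh arm
        run = 0               # successes so far on this arm
        committed = False     # True once the arm reaches a run of m
        while pulls < n:
            pulls += 1
            if rng.random() < p:
                run += 1
                if run >= m:
                    committed = True
            else:
                failures += 1
                if not committed:
                    break     # discard this arm and draw a new one
    return failures / n


if __name__ == "__main__":
    rng = random.Random(0)
    n = 10_000
    m = math.isqrt(n)         # run length near sqrt(n)
    reps = 50
    avg = sum(simulate_run_strategy(n, m, rng) for _ in range(reps)) / reps
    print(f"average failure proportion: {avg:.4f}")
    print(f"sqrt(2)/sqrt(n) = {math.sqrt(2 / n):.4f}, 2/sqrt(n) = {2 / math.sqrt(n):.4f}")
```

Running the script prints the simulated average next to the two reference rates from the abstract. The comparison is only indicative, since the rule above is a simplification and the stated bounds are asymptotic.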

Citation (APA)

Berry, D. A., Chen, R. W., Zame, A., Heath, D. C., & Shepp, L. A. (1997). Bandit problems with infinitely many arms. Annals of Statistics, 25(5), 2103–2116. https://doi.org/10.1214/aos/1069362389
