Abstract
We consider a bandit problem consisting of a sequence of n choices from an infinite number of Bernoulli arms, with n → ∞. The objective is to minimize the long-run failure rate. The Bernoulli parameters are independent observations from a distribution F. We first assume F to be the uniform distribution on (0, 1) and consider various extensions. In the uniform case we show that the best lower bound for the expected failure proportion is between √2/√n and 2/√n and we exhibit classes of strategies that achieve the latter.
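To make the 2/√n scale concrete, the following is a minimal simulation sketch, not code from the paper. It assumes one particular rule, which the abstract does not spell out: pull a fresh arm until its first failure and then discard it for good, unless the arm first produces m successes in a row, in which case it is kept for all remaining pulls. The function names, the choice m ≈ √n, and the replication count are illustrative assumptions. A rough count suggests about m discarded arms (one failure each) plus about n/(m + 2) failures on the kept arm, which is close to 2√n failures in total when m ≈ √n.

```python
import math
import random


def simulate_failure_proportion(n, m, rng):
    """Run one horizon of n pulls and return the observed failure proportion.

    Assumed rule (illustrative): discard an arm at its first failure unless
    it has already produced m consecutive successes from its first pull;
    once an arm reaches that run, keep it for every remaining pull.
    """
    failures = 0
    pulls = 0
    committed = False
    while pulls < n:
        p = rng.random()  # success probability of a fresh arm, Uniform(0, 1)
        run = 0           # successes so far on this arm (unbroken before commit)
        while pulls < n:
            pulls += 1
            if rng.random() < p:
                run += 1
                if run >= m:
                    committed = True  # keep this arm for the rest of the horizon
            else:
                failures += 1
                if not committed:
                    break  # discard the arm and never return to it
    return failures / n


def mean_failure_proportion(n, reps, seed=0):
    """Average the failure proportion over reps independent horizons."""
    rng = random.Random(seed)
    m = max(1, round(math.sqrt(n)))  # run length of order sqrt(n) (assumption)
    total = sum(simulate_failure_proportion(n, m, rng) for _ in range(reps))
    return total / reps


if __name__ == "__main__":
    n = 2500
    estimate = mean_failure_proportion(n, reps=100)
    print(f"n={n}: simulated {estimate:.4f}, 2/sqrt(n) = {2 / math.sqrt(n):.4f}")
```

The simulated average is a finite-n Monte Carlo estimate, so it should only be expected to land near 2/√n, not to match it exactly.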
Berry, D. A., Chen, R. W., Zame, A., Heath, D. C., & Shepp, L. A. (1997). Bandit problems with infinitely many arms. Annals of Statistics, 25(5), 2103–2116. https://doi.org/10.1214/aos/1069362389