We propose an analysis of Probably Approximately Correct (PAC) identification of an ε-best arm in graph bandit models with Gaussian distributions. We consider finite but potentially very large bandit models where the set of arms is endowed with a graph structure, and we assume that the arms' expectations µ are smooth with respect to this graph. Our goal is to identify an arm whose expectation is at most ε below the largest of all means. We focus on the fixed-confidence setting: given a risk parameter δ, we consider sequential strategies that yield an ε-optimal arm with probability at least 1 − δ. All such strategies use at least T*_{R,ε}(µ) log(1/δ) samples, where R is the smoothness parameter. We identify the complexity term T*_{R,ε}(µ) as the solution of a min-max problem for which we give a game-theoretic analysis and an approximation procedure. This procedure is the key element required by the asymptotically optimal Track-and-Stop strategy.
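To make the two conditions in the abstract concrete, here is a minimal Python sketch on a toy instance. It rests on assumptions the abstract does not state: smoothness is taken to be the sum of squared mean differences over the graph's edges (the quadratic form of an unweighted graph Laplacian, a common choice in the spectral bandit setting), and the path graph, the means, and the helper names `smoothness` and `eps_optimal_arms` are hypothetical, chosen only for illustration.

```python
def smoothness(mu, edges):
    """Sum of squared mean differences over the edges.

    Assumption: this stands in for the smoothness measure; it equals
    mu^T L mu for the unweighted graph Laplacian L of the edge list.
    """
    return sum((mu[i] - mu[j]) ** 2 for i, j in edges)


def eps_optimal_arms(mu, eps):
    """Indices of arms whose mean is at most eps below the largest mean."""
    best = max(mu)
    return [arm for arm, mean in enumerate(mu) if mean >= best - eps]


# Hypothetical toy instance: four arms on a path graph 0 - 1 - 2 - 3.
mu = [0.0, 0.5, 1.0, 0.75]
edges = [(0, 1), (1, 2), (2, 3)]
R = 1.0      # assumed smoothness budget
eps = 0.25   # tolerance on the gap to the best mean

print(smoothness(mu, edges) <= R)   # is mu within the smoothness budget?
print(eps_optimal_arms(mu, eps))    # arms that count as a correct answer
```

In this toy instance any arm returned by `eps_optimal_arms` is an acceptable output for a PAC strategy; the sketch only checks the problem's constraints and does not implement the sampling rule or the Track-and-Stop strategy.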
Citation
Kocák, T., & Garivier, A. (2021). Epsilon Best Arm Identification in Spectral Bandits. In IJCAI International Joint Conference on Artificial Intelligence (pp. 2636–2642). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2021/363