Epsilon Best Arm Identification in Spectral Bandits

Abstract

We propose an analysis of Probably Approximately Correct (PAC) identification of an ε-best arm in graph bandit models with Gaussian distributions. We consider finite but potentially very large bandit models where the set of arms is endowed with a graph structure, and we assume that the arms' expectations µ are smooth with respect to this graph. Our goal is to identify an arm whose expectation is at most ε below the largest of all means. We focus on the fixed-confidence setting: given a risk parameter δ, we consider sequential strategies that yield an ε-optimal arm with probability at least 1 − δ. All such strategies use at least T*_{R,ε}(µ) log(1/δ) samples, where R is the smoothness parameter. We identify the complexity term T*_{R,ε}(µ) as the solution of a min-max problem for which we give a game-theoretic analysis and an approximation procedure. This procedure is the key element required by the asymptotically optimal Track-and-Stop strategy.

Cite

Kocák, T., & Garivier, A. (2021). Epsilon Best Arm Identification in Spectral Bandits. In IJCAI International Joint Conference on Artificial Intelligence (pp. 2636–2642). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2021/363
