The paper aims to give a unified account of the central concepts in recent work on bandit processes and dynamic allocation indices; to show how these reduce some previously intractable problems to the problem of calculating such indices; and to describe how these calculations may be carried out. Applications to stochastic scheduling, sequential clinical trials and a class of search problems are discussed.
Gittins, J. C. (1979). Bandit Processes and Dynamic Allocation Indices. Journal of the Royal Statistical Society: Series B (Methodological), 41(2), 148–164. https://doi.org/10.1111/j.2517-6161.1979.tb01068.x