We have indicated several times that the two most widely studied discount sequences are the n-horizon uniform and the geometric. The former applies when the problem is to maximize the sum of n observations. When n is unknown, the corresponding random discount sequence can be taken to be nonrandom (see Section 3.1); it can be any nonincreasing sequence, depending on the uncertainty in n. As a special case, suppose n has a geometric distribution, so that the opportunity for gain ceases at each stage with constant probability 1 − α. Then, and in many other circumstances as well, the appropriate discount sequence is geometric: A = (1, α, α², α³, ...).
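The link between a geometrically distributed horizon and geometric discounting can be checked numerically. The sketch below is illustrative and not taken from the chapter. It assumes that α is the per-stage probability that play continues, that the horizon N is independent of the observations, and that the nonrandom discount on stage m equals P(N ≥ m). The function names and the truncation point `cutoff` are hypothetical.

```python
import math


def geometric_horizon_pmf(alpha, k):
    """P(N = k) for k = 1, 2, ...: play survives k - 1 stages, then stops.

    Assumption: alpha is the probability of continuing after each stage.
    """
    return (1 - alpha) * alpha ** (k - 1)


def discount_from_horizon(pmf, m, cutoff=10_000):
    """Approximate P(N >= m) by summing the pmf from m up to a truncation point."""
    return sum(pmf(k) for k in range(m, cutoff + 1))


alpha = 0.9
stages = range(1, 6)

# Discount weights implied by the random horizon: P(N >= m) for m = 1..5.
tail = [
    discount_from_horizon(lambda k: geometric_horizon_pmf(alpha, k), m)
    for m in stages
]

# Geometric discount sequence (1, alpha, alpha^2, ...), truncated to 5 terms.
geometric = [alpha ** (m - 1) for m in stages]

# The two sequences should agree up to floating-point error.
agree = all(math.isclose(t, g, rel_tol=1e-9) for t, g in zip(tail, geometric))
```

Under these assumptions `agree` should come out true, since the tail sum of the geometric distribution from stage m collapses to the (m − 1)-th power of α.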
Berry, D. A., & Fristedt, B. (1985). Many independent arms; geometric discounting. In Bandit problems (pp. 136–149). Springer Netherlands. https://doi.org/10.1007/978-94-015-3711-7_6