The explore-exploit dilemma in nonstationary decision making under uncertainty


Abstract

It is often assumed that autonomous systems operate in environments that can be described by a stationary (time-invariant) model. However, real-world environments are often nonstationary (time-varying): the underlying phenomena change over time, so stationary approximations of the environment may quickly lose relevance. Here, two approaches are presented and applied in the context of reinforcement learning in nonstationary environments. In Sect. 2.2, the first approach leverages reinforcement learning in the presence of a changing reward model. In particular, a functional termed the Fog-of-War is used to drive exploration, which results in the timely discovery of new models in nonstationary environments. In Sect. 2.3, the Fog-of-War functional is adapted in real time to reflect the heterogeneous information content of a real-world environment; this is critically important for applying the approach of Sect. 2.2 in real-world environments.
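The abstract describes exploration driven by an uncertainty functional that accumulates over unvisited parts of the environment. As a rough illustration of that idea (not the authors' actual formulation, which is given in the chapter itself), the sketch below shows a nonstationary multi-armed bandit agent whose per-arm "fog" bonus grows on arms it has not recently sampled, prompting periodic re-exploration so drifting reward models are rediscovered. All class and parameter names here (`FogOfWarBandit`, `fog_rate`, `alpha`) are hypothetical choices for this sketch.

```python
class FogOfWarBandit:
    """Illustrative agent: an exploration bonus ("fog") accumulates on
    arms not recently sampled, so changes in their reward models are
    eventually rediscovered. A simplified stand-in for the chapter's
    Fog-of-War functional, not the authors' formulation."""

    def __init__(self, n_arms, fog_rate=0.1):
        self.n_arms = n_arms
        self.fog_rate = fog_rate          # how fast uncertainty accumulates
        self.estimates = [0.0] * n_arms   # running reward estimates
        self.fog = [1.0] * n_arms         # per-arm uncertainty ("fog") level

    def select_arm(self):
        # Act greedily on value estimate plus the fog-driven bonus.
        scores = [q + f for q, f in zip(self.estimates, self.fog)]
        return max(range(self.n_arms), key=lambda a: scores[a])

    def update(self, arm, reward, alpha=0.2):
        # A constant step size (rather than 1/n averaging) lets the
        # estimate track a nonstationary reward.
        self.estimates[arm] += alpha * (reward - self.estimates[arm])
        # Sampling an arm clears its fog; fog accumulates on the rest.
        for a in range(self.n_arms):
            self.fog[a] = 0.0 if a == arm else self.fog[a] + self.fog_rate
```

Because the fog term grows without bound on neglected arms, every arm is eventually revisited, which is the property the abstract attributes to Fog-of-War-driven exploration in changing environments.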

Citation (APA)

Axelrod, A., & Chowdhary, G. (2015). The explore-exploit dilemma in nonstationary decision making under uncertainty. In Studies in Systems, Decision and Control (Vol. 42, pp. 29–52). Springer International Publishing. https://doi.org/10.1007/978-3-319-26327-4_2
