We consider a reinforcement learning setting where the learner is given a set of possible models containing the true model. While there are algorithms that are able to successfully learn optimal behavior in this setting, they do so without trying to identify the underlying true model. Indeed, we show that there are cases in which the attempt to find the true model is doomed to failure.
Ortner, R. (2016). Optimal Behavior is Easier to Learn than the Truth. Minds and Machines, 26(3), 243–252. https://doi.org/10.1007/s11023-016-9389-y