Reinforcement learning aims to maximise an external evaluative signal over a certain time horizon. If no reward is available within that horizon, the agent faces an autonomous learning task, which it can use to explore, to gather information, and to bootstrap particular learning behaviours. We discuss here how the agent can use a current representation of the value, of its state, and of the environment to produce autonomous learning behaviour in the absence of meaningful rewards. The family of methods introduced here is open to further development and research in the field of reflexive reinforcement learning.
CITATION STYLE
Lyons, B. I., & Herrmann, J. M. (2020). Reflexive Reinforcement Learning: Methods for Self-Referential Autonomous Learning. In International Joint Conference on Computational Intelligence (Vol. 1, pp. 381–388). Science and Technology Publications, Lda. https://doi.org/10.5220/0009997503810388