Traditional Reinforcement Learning (RL) methods are insufficient for AGIs, which must be able to learn to deal with Partially Observable Markov Decision Processes (POMDPs). We investigate a novel method for dealing with this problem: standard RL techniques that take as input the hidden-layer output of a Sequential Constant-Size Compressor (SCSC). The SCSC takes the form of a sequential Recurrent Auto-Associative Memory, trained through standard back-propagation. Results illustrate the feasibility of this approach: the system learns to deal with high-dimensional visual observations (up to 640 pixels) in partially observable environments with long time lags (up to 12 steps) between relevant sensory information and necessary action. © 2011 Springer-Verlag Berlin Heidelberg.
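The compressor described in the abstract can be pictured with a minimal sketch. This is an illustration under stated assumptions, not the authors' implementation: the class name `SequentialRAAM`, the code size of 16, the tanh encoder, the linear decoder, and the weight initialization are all hypothetical choices; only the 640-pixel observation size and the 12-step history come from the abstract. Training by back-propagation on the reconstruction error is omitted; the sketch shows only how a history of any length folds into a fixed-size vector that an RL method could use as its state.

```python
import numpy as np


class SequentialRAAM:
    """Hypothetical minimal sequential RAAM; sizes and init are illustrative."""

    def __init__(self, obs_dim, code_dim, seed=0):
        rng = np.random.default_rng(seed)
        in_dim = obs_dim + code_dim
        self.obs_dim = obs_dim
        self.code_dim = code_dim
        # Encoder: [observation, previous code] -> new code (the hidden layer).
        self.W_enc = rng.normal(0.0, 0.1, (code_dim, in_dim))
        self.b_enc = np.zeros(code_dim)
        # Decoder: new code -> reconstruction of [observation, previous code].
        self.W_dec = rng.normal(0.0, 0.1, (in_dim, code_dim))
        self.b_dec = np.zeros(in_dim)

    def encode(self, obs, prev_code):
        x = np.concatenate([obs, prev_code])
        return np.tanh(self.W_enc @ x + self.b_enc)

    def decode(self, code):
        return self.W_dec @ code + self.b_dec

    def compress(self, observations):
        """Fold a sequence of observations into one constant-size code."""
        code = np.zeros(self.code_dim)
        for obs in observations:
            code = self.encode(obs, code)
        return code

    def reconstruction_error(self, obs, prev_code):
        """Mean squared auto-association error that training would minimize."""
        target = np.concatenate([obs, prev_code])
        recon = self.decode(self.encode(obs, prev_code))
        return float(np.mean((recon - target) ** 2))


# Usage: compress a 12-step history of 640-pixel observations (random data).
scsc = SequentialRAAM(obs_dim=640, code_dim=16)
history = np.random.default_rng(1).random((12, 640))
state = scsc.compress(history)
print(state.shape)  # (16,)
```

The size of the resulting code does not depend on how many observations were folded in, which is what allows a standard RL method expecting a fixed-size input to operate on it.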
CITATION STYLE
Gisslén, L., Luciw, M., Graziano, V., & Schmidhuber, J. (2011). Sequential constant size compressors for reinforcement learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6830 LNAI, pp. 31–40). https://doi.org/10.1007/978-3-642-22887-2_4