Ideas for a reinforcement learning algorithm that learns programs

Abstract

Conventional reinforcement learning algorithms such as Q-learning are poorly suited to learning complicated procedures or programs because they were not designed for that purpose. AIXI, a general framework for reinforcement learning, can learn programs as the environment model, but it is not computable. AIXI has a computable and computationally tractable approximation, MC-AIXI(FAC-CTW), but this approximation models the environment as a trie rather than as programs, and it still does not resolve the trade-off between exploration and exploitation within a realistic amount of computation. This paper presents our research idea for an efficient reinforcement learning algorithm that retains the property of modeling the environment as programs. The algorithm also models the policy as programs and can imitate other agents in the environment. Its design rests on two premises: (1) the ability to program is indispensable for human-level intelligence, and (2) a realistic solution to the exploration/exploitation trade-off is teaching via imitation.

Katayama, S. (2016). Ideas for a reinforcement learning algorithm that learns programs. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9782, pp. 354–362). Springer Verlag. https://doi.org/10.1007/978-3-319-41649-6_36
