Hierarchical reinforcement learning based on subgoal discovery and subpolicy specialization

  • Bram Bakker
  • Jürgen Schmidhuber


We introduce a new method for hierarchical reinforcement learning. High-level policies automatically discover subgoals; low-level policies learn to specialize on different subgoals. Subgoals are represented as desired abstract observations which cluster raw input data. High-level value functions cover the state space at a coarse level; low-level value functions cover only parts of the state space, at a fine-grained level. Experiments show that this method outperforms several flat reinforcement learning methods in a deterministic task and in a stochastic task.
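The two-level structure described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it substitutes tabular Q-learning on a toy corridor, and a fixed partition of raw states stands in for the learned clustering of observations. The high-level policy selects a subgoal (a desired abstract observation) and maintains a coarse value function over clusters; each subgoal gets its own fine-grained low-level value function, trained with an intrinsic reward for reaching that abstract observation. All names and parameters here are illustrative assumptions.

```python
import random

random.seed(0)

N = 12                    # corridor states 0..N-1 (toy stand-in for the paper's tasks)
BLOCK = 4                 # states per abstract cluster (stand-in for learned clustering)
N_CLUSTERS = N // BLOCK
GOAL_CLUSTER = N_CLUSTERS - 1
ACTIONS = (-1, +1)        # move left / move right

def cluster(s):
    """Abstract observation for raw state s (the paper learns this by clustering)."""
    return s // BLOCK

# High-level value function: coarse, over (abstract observation, subgoal cluster).
Q_hi = [[0.0] * N_CLUSTERS for _ in range(N_CLUSTERS)]
# Low-level value functions: one fine-grained table per subgoal, over raw states.
Q_lo = [[[0.0, 0.0] for _ in range(N)] for _ in range(N_CLUSTERS)]

def eps_greedy(qvals, eps):
    if random.random() < eps:
        return random.randrange(len(qvals))
    return max(range(len(qvals)), key=lambda i: qvals[i])

def run_episode(eps=0.1, alpha=0.5, gamma=0.95, max_hi=10, max_lo=20):
    s = 0
    for _ in range(max_hi):
        c0 = cluster(s)
        if c0 == GOAL_CLUSTER:
            return True
        g = eps_greedy(Q_hi[c0], eps)          # high level picks a subgoal cluster
        r_ext, steps = 0.0, 0
        for _ in range(max_lo):
            a = eps_greedy(Q_lo[g][s], eps)    # subpolicy specialized on subgoal g
            s2 = max(0, min(N - 1, s + ACTIONS[a]))
            done = cluster(s2) != c0           # subgoal attempt ends on cluster change
            # Intrinsic reward: reach the desired abstract observation g.
            r_in = (1.0 if cluster(s2) == g else -1.0) if done else -0.01
            best = 0.0 if done else max(Q_lo[g][s2])
            Q_lo[g][s][a] += alpha * (r_in + gamma * best - Q_lo[g][s][a])
            # External reward, discounted to the start of the subgoal attempt.
            r_ext += (gamma ** steps) * (1.0 if cluster(s2) == GOAL_CLUSTER else -0.01)
            s, steps = s2, steps + 1
            if done:
                break
        best_hi = 0.0 if cluster(s) == GOAL_CLUSTER else max(Q_hi[cluster(s)])
        Q_hi[c0][g] += alpha * (r_ext + (gamma ** steps) * best_hi - Q_hi[c0][g])
    return cluster(s) == GOAL_CLUSTER

for _ in range(500):
    run_episode()

print(run_episode(eps=0.0))   # greedy run after training
```

After training, the high-level policy chains subgoals (cluster 0 → 1 → 2) while each low-level table only ever needs accurate values for the part of the state space its subgoal concerns, mirroring the coarse/fine division of the value functions described above.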
