Bias-Optimal Incremental Learning of Control Sequences for Virtual Robots

by Juergen Schmidhuber, Viktor P Zhumatiy, Matteo Gagliolo
Proceedings of the Eighth Conference on Intelligent Autonomous Systems, IAS8 (2004)

Abstract

Learning and planning control is hard. The search space of traditional planners consists of sequences of primitive actions. To exploit reusable subsequences and other algorithmic regularities, however, we should instead search the general space of programs that compute action sequences. Such programs may invoke very fast thinking actions consuming only nanoseconds (such as conditional jumps to certain code addresses) as well as very slow control actions consuming seconds in the real world (such as stretch-arm-until-obstacle-sensation). What is an optimal way of allocating time to tests of such non-homogeneous programs? What is an optimal way of reusing experience with previous tasks to learn solutions to new tasks? One answer is given by the recent Optimal Ordered Problem Solver OOPS, a near-bias-optimal incremental extension of Levin's nonincremental universal search, which we apply to virtual robotics for the first time: our snake robot uses OOPS to learn to walk and jump in a partially observable environment (POMDP) with a huge state/action space.
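The time-allocation question raised in the abstract can be illustrated with a minimal sketch of Levin-style universal search, the nonincremental method that OOPS extends. This is not the authors' implementation and it omits the incremental reuse of earlier solutions that distinguishes OOPS. The instruction set (`inc`, `dbl`, `move`), its probabilities and costs, and the function names are all hypothetical, chosen only to mimic cheap "thinking" steps alongside one slow control action. In each phase, a program with prior probability P receives at most P times the phase's total time, and the total doubles from phase to phase.

```python
# Toy sketch of Levin-style universal search (assumed setup, not the paper's code).
# Each hypothetical primitive has: prior probability, time cost, effect on a value.
PRIMITIVES = {
    "inc": (0.25, 1, lambda v: v + 1),     # cheap "thinking"-like step
    "dbl": (0.25, 1, lambda v: v * 2),     # cheap "thinking"-like step
    "move": (0.25, 50, lambda v: v + 10),  # slow "control action"
}


def programs_with_prob_at_least(threshold, prefix=(), prob=1.0):
    """Yield (program, probability) for every program whose prior
    probability (product of its token probabilities) is >= threshold."""
    for op, (p, _cost, _fn) in PRIMITIVES.items():
        q = prob * p
        if q >= threshold:
            program = prefix + (op,)
            yield program, q
            yield from programs_with_prob_at_least(threshold, program, q)


def run(program, budget):
    """Execute the program from value 0; return the final value,
    or None if the program would exceed its time budget."""
    value, used = 0, 0
    for op in program:
        _p, cost, fn = PRIMITIVES[op]
        if used + cost > budget:
            return None
        used += cost
        value = fn(value)
    return value


def levin_search(target, max_total=2 ** 20):
    """Return (program, phase_total) for the first program found that
    computes `target`, or None if no phase up to max_total succeeds."""
    total = 1
    while total <= max_total:
        # Only programs that would get at least one time step are tried.
        for program, prob in programs_with_prob_at_least(1.0 / total):
            # Each program's share of the phase is proportional to its prior.
            if run(program, prob * total) == target:
                return program, total
        total *= 2  # double the phase budget and retry everything
    return None


if __name__ == "__main__":
    print(levin_search(3))
    print(levin_search(10))
```

In this toy setting, a short program built from one slow action can be found in an earlier phase than a long program built from cheap steps, because the time share depends on the program's prior probability rather than on the cost of its individual instructions.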

