Learning and reusing goal-specific policies for goal-driven autonomy


Abstract

In certain adversarial environments, reinforcement learning (RL) techniques require a prohibitively large number of episodes to learn a high-performing strategy for action selection. For example, Q-learning is particularly slow to learn a policy to win complex strategy games. We propose GRL, a case-based goal-driven autonomy (GDA) agent embedded in the RL cycle and the first GDA system capable of learning and reusing goal-specific policies. GRL acquires and reuses cases that capture episodic knowledge about an agent's (1) expectations, (2) goals to pursue when these expectations are not met, and (3) actions for achieving these goals in given states. Our hypothesis is that, unlike RL, GRL can rapidly fine-tune strategies by exploiting the episodic knowledge captured in its cases. We report performance gains versus a state-of-the-art GDA agent and an RL agent on challenging tasks in two real-time video game domains. © 2012 Springer-Verlag.
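To make the case structure described above concrete, the sketch below pairs each expectation with a goal and a goal-specific Q-table, and switches cases when an expectation is violated. This is a minimal illustration assuming a tabular Q-learning setting; the class and function names (Case, gda_step, and the retrieval rule) are hypothetical stand-ins, not the paper's actual implementation.

```python
import random
from collections import defaultdict

class Case:
    """A hypothetical GDA case: an expectation, a goal to pursue when
    that expectation is violated, and a goal-specific policy (Q-table)."""

    def __init__(self, goal, expectation):
        self.goal = goal                  # goal this case is for
        self.expectation = expectation    # predicate: does the state match?
        self.q = defaultdict(float)       # goal-specific Q-values

    def act(self, state, actions, epsilon=0.1):
        # Epsilon-greedy action selection over this goal's Q-table.
        if random.random() < epsilon:
            return random.choice(actions)
        return max(actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state, actions,
               alpha=0.5, gamma=0.9):
        # One-step Q-learning update for this goal's policy.
        best_next = max(self.q[(next_state, a)] for a in actions)
        self.q[(state, action)] += alpha * (
            reward + gamma * best_next - self.q[(state, action)])

def gda_step(case_base, active, state, actions):
    """Detect a discrepancy, select a replacement case, and act."""
    if not active.expectation(state):
        # Discrepancy: fall back to a stored case whose expectation holds
        # here (a stand-in for the paper's case retrieval step).
        active = next((c for c in case_base if c.expectation(state)), active)
    return active, active.act(state, actions)
```

Keeping one Q-table per goal is what lets a case be reused: when a discrepancy triggers a goal the agent has pursued before, the retrieved policy resumes learning from its previous values rather than from scratch.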

Citation (APA)

Jaidee, U., Muñoz-Avila, H., & Aha, D. W. (2012). Learning and reusing goal-specific policies for goal-driven autonomy. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7466 LNAI, pp. 182–195). https://doi.org/10.1007/978-3-642-32986-9_15
