Learning and reusing goal-specific policies for goal-driven autonomy


Abstract

In certain adversarial environments, reinforcement learning (RL) techniques require a prohibitively large number of episodes to learn a high-performing strategy for action selection. For example, Q-learning is particularly slow to learn a policy to win complex strategy games. We propose GRL, a case-based goal-driven autonomy (GDA) agent embedded in the RL cycle and the first GDA system capable of learning and reusing goal-specific policies. GRL acquires and reuses cases that capture episodic knowledge about an agent's (1) expectations, (2) goals to pursue when these expectations are not met, and (3) actions for achieving these goals in given states. Our hypothesis is that, unlike RL, GRL can rapidly fine-tune strategies by exploiting the episodic knowledge captured in its cases. We report performance gains versus a state-of-the-art GDA agent and an RL agent on challenging tasks in two real-time video game domains. © 2012 Springer-Verlag.
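To make the case structure described above concrete, the sketch below pairs each expectation with a goal and a goal-specific Q-table, and switches cases when an expectation is violated. This is a minimal illustration assuming a tabular Q-learning setting; the class and function names (Case, gda_step, and the retrieval rule) are hypothetical stand-ins, not the paper's actual implementation.

```python
import random
from collections import defaultdict

class Case:
    """A hypothetical GDA case: an expectation, a goal to pursue when
    that expectation is violated, and a goal-specific policy (Q-table)."""

    def __init__(self, goal, expectation):
        self.goal = goal                  # goal this case is for
        self.expectation = expectation    # predicate: does the state match?
        self.q = defaultdict(float)       # goal-specific Q-values

    def act(self, state, actions, epsilon=0.1):
        # Epsilon-greedy action selection over this goal's Q-table.
        if random.random() < epsilon:
            return random.choice(actions)
        return max(actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state, actions,
               alpha=0.5, gamma=0.9):
        # One-step Q-learning update for this goal's policy.
        best_next = max(self.q[(next_state, a)] for a in actions)
        self.q[(state, action)] += alpha * (
            reward + gamma * best_next - self.q[(state, action)])

def gda_step(case_base, active, state, actions):
    """Detect a discrepancy, select a replacement case, and act."""
    if not active.expectation(state):
        # Discrepancy: fall back to a stored case whose expectation holds
        # here (a stand-in for the paper's case retrieval step).
        active = next((c for c in case_base if c.expectation(state)), active)
    return active, active.act(state, actions)
```

Keeping one Q-table per goal is what lets a case be reused: when a discrepancy triggers a goal the agent has pursued before, the retrieved policy resumes learning from its previous values rather than from scratch.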

Citation (APA)

Jaidee, U., Muñoz-Avila, H., & Aha, D. W. (2012). Learning and reusing goal-specific policies for goal-driven autonomy. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7466 LNAI, pp. 182–195). https://doi.org/10.1007/978-3-642-32986-9_15
