Learning from demonstration using MDP induced metrics

Francisco S. Melo; Manuel Lopes

Conference Proceedings (Open Access)


Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2010), vol. 6322 LNAI (Part 2), pp. 385-401

DOI: 10.1007/978-3-642-15883-4_25


Abstract

In this paper we address the problem of learning a policy from demonstration. Assuming that the policy to be learned is the optimal policy for an underlying MDP, we propose a novel way of leveraging the underlying MDP structure in a kernel-based approach. Our approach rests on the insight that the MDP structure can be encapsulated in a suitable state-space metric. In particular, we show that, using MDP metrics, we can cast the problem of learning from demonstration as a classification problem and attain generalization performance similar to that of methods based on inverse reinforcement learning, at a much lower online computational cost. Our method also attains better generalization than other supervised learning methods that fail to consider the MDP structure. © 2010 Springer-Verlag Berlin Heidelberg.
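The idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say which MDP metric or kernel is used, so the code below assumes a bisimulation-style metric restricted to deterministic transitions, an exponential kernel, and kernel-weighted voting as the classifier. The names `mdp_metric`, `classify`, `next_state`, `reward`, and `bandwidth`, and the toy MDP itself, are all hypothetical.

```python
import math

def mdp_metric(next_state, reward, gamma=0.9, iters=200):
    """Assumed metric: iterate d(s,t) = max_a |r(s,a) - r(t,a)| + gamma * d(s',t')
    for a deterministic MDP, where s' and t' are the successors of s and t under a."""
    n = len(next_state)
    n_actions = len(next_state[0])
    d = [[0.0] * n for _ in range(n)]
    for _ in range(iters):
        d = [[max(abs(reward[s][a] - reward[t][a])
                  + gamma * d[next_state[s][a]][next_state[t][a]]
                  for a in range(n_actions))
              for t in range(n)]
             for s in range(n)]
    return d

def classify(state, demos, d, bandwidth=1.0):
    """Pick the action with the largest kernel-weighted vote, using the
    kernel exp(-d(state, s) / bandwidth) over demonstrated (s, a) pairs."""
    votes = {}
    for s, a in demos:
        weight = math.exp(-d[state][s] / bandwidth)
        votes[a] = votes.get(a, 0.0) + weight
    return max(votes, key=votes.get)

# Toy MDP (hypothetical): states 0 and 1 are rewarded for action 0,
# states 2 and 3 for action 1; every action leads to absorbing state 4.
next_state = [[4, 4], [4, 4], [4, 4], [4, 4], [4, 4]]
reward = [[1, 0], [1, 0], [0, 1], [0, 1], [0, 0]]

d = mdp_metric(next_state, reward)

# Demonstrations cover only states 0 and 2.
demos = [(0, 0), (2, 1)]

action_at_1 = classify(1, demos, d)  # 0: state 1 is at distance 0 from state 0
action_at_3 = classify(3, demos, d)  # 1: state 3 is at distance 0 from state 2
```

In this toy case the metric places undemonstrated states 1 and 3 at distance zero from demonstrated states 0 and 2, so the classifier carries the demonstrated actions over to them. A distance that ignored rewards and transitions would have no basis for grouping the states this way.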


Citation (APA)

Melo, F. S., & Lopes, M. (2010). Learning from demonstration using MDP induced metrics. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6322 LNAI, pp. 385–401). https://doi.org/10.1007/978-3-642-15883-4_25

