Model-free IRL using maximum likelihood estimation


Abstract

Inverse reinforcement learning (IRL) investigates the problem of learning an expert's unknown reward function from a limited number of demonstrations recorded from the expert's behavior. To gain traction on this challenging and underconstrained problem, IRL methods predominantly represent the expert's reward function as a linear combination of known features. Most existing IRL algorithms either assume the availability of a transition function or provide a complex and inefficient approach to learning it. In this paper, we present a model-free approach to IRL that casts IRL in the maximum likelihood framework. We present modifications of model-free Q-learning that replace its maximization step so that the gradient of the Q-function can be computed, and we use gradient ascent on the feature weights to maximize the likelihood of the expert's trajectories. We demonstrate on two problem domains that our approach improves the likelihood compared to previous methods.
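
For illustration, the following is a minimal sketch of the general recipe the abstract describes: rewards linear in known features, a soft (differentiable) replacement for the max in Q-learning so that the gradient of Q with respect to the feature weights can be propagated without a transition model, and gradient ascent on those weights to maximize the likelihood of the expert's state-action pairs under a Boltzmann policy. This is not the authors' exact algorithm; the toy environment, the feature map phi, the sampled transitions, and all hyperparameters are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions, n_features = 5, 3, 4
gamma, beta = 0.95, 2.0                      # discount factor, Boltzmann temperature
phi = rng.random((n_states, n_actions, n_features))   # assumed known feature map

def soft_q_with_gradient(theta, transitions, alpha=0.1, sweeps=100):
    # Model-free soft Q-learning on sampled (s, a, s') transitions.
    # The hard max in the Q-learning target is replaced by a log-sum-exp
    # ("soft max") so Q is differentiable in theta; dQ/dtheta is propagated
    # alongside Q with the same update rule.
    Q = np.zeros((n_states, n_actions))
    dQ = np.zeros((n_states, n_actions, n_features))
    for _ in range(sweeps):
        for (s, a, s_next) in transitions:
            m = Q[s_next].max()
            w = np.exp(beta * (Q[s_next] - m))
            v_next = m + np.log(w.sum()) / beta      # soft value of s'
            w /= w.sum()                             # Boltzmann weights
            dv_next = w @ dQ[s_next]                 # gradient of the soft value
            reward = theta @ phi[s, a]               # reward linear in features
            Q[s, a] += alpha * (reward + gamma * v_next - Q[s, a])
            dQ[s, a] += alpha * (phi[s, a] + gamma * dv_next - dQ[s, a])
    return Q, dQ

def log_likelihood_and_grad(theta, expert_trajs, transitions):
    # Log-likelihood of the expert's (s, a) pairs under a Boltzmann policy
    # pi(a|s) proportional to exp(beta * Q(s, a)), plus its gradient in theta.
    Q, dQ = soft_q_with_gradient(theta, transitions)
    ll, grad = 0.0, np.zeros(n_features)
    for traj in expert_trajs:
        for (s, a) in traj:
            logits = beta * Q[s] - beta * Q[s].max()
            p = np.exp(logits) / np.exp(logits).sum()
            ll += np.log(p[a] + 1e-12)
            # d log pi(a|s)/d theta = beta * (dQ[s, a] - sum_b pi(b|s) dQ[s, b])
            grad += beta * (dQ[s, a] - p @ dQ[s])
    return ll, grad

# Toy data: random sampled transitions (no transition model needed) and a
# made-up expert trajectory of (state, action) pairs.
transitions = [(rng.integers(n_states), rng.integers(n_actions),
                rng.integers(n_states)) for _ in range(200)]
expert_trajs = [[(0, 1), (2, 0), (4, 2)]]

theta = np.zeros(n_features)
for step in range(30):                       # gradient ascent on the likelihood
    ll, grad = log_likelihood_and_grad(theta, expert_trajs, transitions)
    theta += 0.05 * grad

print("log-likelihood:", ll, "learned weights:", theta)

The soft value here is a log-sum-exp, whose gradient is the Boltzmann-weighted average of the per-action Q-gradients; that substitution is what makes the trajectory likelihood differentiable in the feature weights while the Q-function itself is still learned purely from sampled transitions.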

Cite

Citation style: APA

Jain, V., Doshi, P., & Banerjee, B. (2019). Model-free IRL using maximum likelihood estimation. In 33rd AAAI Conference on Artificial Intelligence, AAAI 2019, 31st Innovative Applications of Artificial Intelligence Conference, IAAI 2019 and the 9th AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019 (pp. 3951–3958). AAAI Press. https://doi.org/10.1609/aaai.v33i01.33013951
