Optimal Decision Tree Policies for Markov Decision Processes


Abstract

Interpretability of reinforcement learning policies is essential for many real-world tasks, but learning such interpretable policies is a hard problem. In particular, rule-based policies such as decision trees and rule lists are difficult to optimize due to their non-differentiability. While existing techniques can learn verifiable decision tree policies, there is no guarantee that the learners generate a policy that performs optimally. In this work, we study the optimization of size-limited decision trees for Markov Decision Processes (MDPs) and propose OMDTs: Optimal MDP Decision Trees. Given a user-defined size limit and MDP formulation, OMDT directly maximizes the expected discounted return for the decision tree using Mixed-Integer Linear Programming. By training optimal tree policies for different MDPs, we empirically study the optimality gap for existing imitation learning techniques and find that they perform sub-optimally. We show that this is due to an inherent shortcoming of imitation learning, namely that complex policies cannot be represented using size-limited trees. In such cases, it is better to directly optimize the tree for expected return. While there is generally a trade-off between the performance and interpretability of machine learning models, we find that on small MDPs, depth-3 OMDTs often perform close to optimally.
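To make the MILP idea concrete, the following is a minimal sketch, in Python with the PuLP modeling library, of the MDP half of such a formulation: binary variables select one action per state (a deterministic policy), continuous variables track discounted state-action occupancy under Bellman flow constraints, and a big-M constraint ties occupancy to the selected actions. The toy MDP, all variable names, and the solver choice are invented for illustration; OMDT's actual formulation additionally encodes the decision-tree splits and the size limit, which are omitted here.

```python
import pulp

# Toy 3-state, 2-action MDP (invented for illustration).
S, A = range(3), range(2)
gamma = 0.9
d0 = [1 / 3, 1 / 3, 1 / 3]  # initial state distribution
# P[s][a][s2] = transition probability, R[s][a] = expected reward
P = [[[0.9, 0.1, 0.0], [0.1, 0.8, 0.1]],
     [[0.0, 0.9, 0.1], [0.5, 0.0, 0.5]],
     [[0.0, 0.1, 0.9], [0.3, 0.3, 0.4]]]
R = [[0.0, 0.2], [0.1, 0.5], [1.0, 0.0]]

m = pulp.LpProblem("mdp_milp_sketch", pulp.LpMaximize)
# pi[s][a] = 1 iff the deterministic policy takes action a in state s
pi = pulp.LpVariable.dicts("pi", (S, A), cat="Binary")
# x[s][a] = discounted occupancy of the state-action pair under the policy
x = pulp.LpVariable.dicts("x", (S, A), lowBound=0)

M = 1.0 / (1.0 - gamma)  # valid upper bound on any occupancy value
for s in S:
    m += pulp.lpSum(pi[s][a] for a in A) == 1  # exactly one action per state
    # Bellman flow: occupancy of s = initial mass + discounted inflow
    m += (pulp.lpSum(x[s][a] for a in A)
          == d0[s] + gamma * pulp.lpSum(P[s2][a][s] * x[s2][a]
                                        for s2 in S for a in A))
    for a in A:
        m += x[s][a] <= M * pi[s][a]  # occupancy only on the chosen action

# Objective: expected discounted return, sum over (s, a) of R(s, a) * x(s, a)
m += pulp.lpSum(R[s][a] * x[s][a] for s in S for a in A)
m.solve(pulp.PULP_CBC_CMD(msg=False))
policy = {s: max(A, key=lambda a: pi[s][a].value()) for s in S}
print("optimal deterministic policy:", policy)
```

Because the initial distribution sums to one, the total discounted occupancy is at most 1/(1-gamma), which gives a valid big-M constant; the objective is then exactly the expected discounted return of the selected deterministic policy. A tree-structured policy as in OMDT would further constrain which pi[s][a] assignments are realizable by a depth-limited tree over the state features.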

Cite

APA

Vos, D., & Verwer, S. (2023). Optimal Decision Tree Policies for Markov Decision Processes. In IJCAI International Joint Conference on Artificial Intelligence (Vol. 2023-August, pp. 5457–5465). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2023/606
