In this paper, we investigate the structural similarities within a finite Markov decision process (MDP). We view a finite MDP as a heterogeneous directed bipartite graph and propose novel measures of state similarity and action similarity that are defined in a mutually reinforcing manner. We prove that the state similarity is a metric and the action similarity is a pseudometric. We also establish the connection between the proposed similarity measures and the optimal values of the MDP. Extensive experiments show that the proposed measures are effective.
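As a rough illustration of the graph view mentioned in the abstract, the sketch below turns a toy finite MDP into a directed bipartite graph with two node types. It is only a minimal sketch under stated assumptions: the toy `transitions` table, the helper name `mdp_to_bipartite`, and the choice to treat each (state, action) pair as its own action node are hypothetical and not taken from the paper. The similarity measures themselves are not reproduced here.

```python
from collections import defaultdict

# Hypothetical toy MDP (not from the paper):
# transitions[(state, action)] = {next_state: probability}
transitions = {
    ("s0", "a"): {"s0": 0.5, "s1": 0.5},
    ("s0", "b"): {"s1": 1.0},
    ("s1", "a"): {"s1": 1.0},
}


def mdp_to_bipartite(transitions):
    """Build a directed bipartite graph with state nodes and action nodes.

    Assumption: each (state, action) pair is its own action node, so edges
    run only state -> action node (unweighted) and action node -> state
    (weighted by the transition probability).
    """
    state_to_actions = defaultdict(list)  # edges: state -> action node
    action_to_states = {}                 # edges: action node -> {state: prob}
    for (state, action), next_dist in transitions.items():
        node = (state, action)
        state_to_actions[state].append(node)
        action_to_states[node] = dict(next_dist)
    return dict(state_to_actions), action_to_states


state_to_actions, action_to_states = mdp_to_bipartite(transitions)

for state, nodes in state_to_actions.items():
    for node in nodes:
        print(state, "->", node, "->", action_to_states[node])
```

Because no edge connects two nodes of the same type, any notion of similarity between states has to pass through the action nodes they reach, and vice versa, which is one way to read the "mutually reinforcing" phrasing of the abstract.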
Wang, H., Dong, S., & Shao, L. (2019). Measuring structural similarities in finite MDPs. In IJCAI International Joint Conference on Artificial Intelligence (Vol. 2019-August, pp. 3684–3690). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2019/511