We consider the problem of acquiring causal representations and concepts in a reinforcement learning setting. Our approach defines a causal variable as one that is both manipulable by a policy and predictive of the outcome. We thereby obtain a parsimonious causal graph in which interventions occur at the level of policies. The approach avoids defining a generative model of the data, pre-processing the data, or learning the transition kernel of the Markov decision process. Instead, causal variables and policies are determined by maximizing a new optimization target, inspired by mediation analysis, that differs from the expected return. The maximization is accomplished using a generalization of Bellman’s equation that is shown to converge, and the method finds meaningful causal representations in a simulated environment.
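For readers unfamiliar with mediation analysis, the following minimal sketch computes the textbook natural direct and indirect effects for a binary treatment A, binary mediator Z, and outcome Y, assuming no unobserved confounding. It is not the optimization target proposed in the paper, which the abstract does not specify; the function name `natural_effects` and the numbers in `P_Z1` and `E_Y` are hypothetical and chosen only for illustration. In the example, the outcome depends on the treatment only through the mediator, which is the sense in which a mediating variable "predicts the outcome".

```python
from typing import Dict, Tuple


def natural_effects(
    p_z1: Dict[int, float],
    e_y: Dict[Tuple[int, int], float],
) -> Tuple[float, float, float]:
    """Return (NDE, NIE, TE) for binary treatment A and binary mediator Z.

    p_z1[a]     : P(Z = 1 | A = a)
    e_y[(a, z)] : E[Y | A = a, Z = z]
    Uses the standard mediation formulas, assuming no unobserved confounding.
    """

    def p_z(a: int, z: int) -> float:
        # Probability of mediator value z under treatment a.
        return p_z1[a] if z == 1 else 1.0 - p_z1[a]

    # Natural direct effect: change A from 0 to 1 while Z keeps its A = 0 distribution.
    nde = sum((e_y[(1, z)] - e_y[(0, z)]) * p_z(0, z) for z in (0, 1))
    # Natural indirect effect: hold A = 0 in the outcome, shift Z to its A = 1 distribution.
    nie = sum(e_y[(0, z)] * (p_z(1, z) - p_z(0, z)) for z in (0, 1))
    # Total effect: E[Y | do(A = 1)] - E[Y | do(A = 0)].
    te = sum(e_y[(1, z)] * p_z(1, z) for z in (0, 1)) - sum(
        e_y[(0, z)] * p_z(0, z) for z in (0, 1)
    )
    return nde, nie, te


# Hypothetical numbers: the treatment strongly shifts the mediator,
# and the outcome depends only on the mediator (full mediation).
P_Z1 = {0: 0.2, 1: 0.9}
E_Y = {(0, 0): 0.0, (0, 1): 1.0, (1, 0): 0.0, (1, 1): 1.0}

nde, nie, te = natural_effects(P_Z1, E_Y)
print(f"NDE={nde:.3f}  NIE={nie:.3f}  TE={te:.3f}")
```

Because the outcome here ignores the treatment once the mediator is fixed, the direct effect vanishes and the whole effect is carried by the mediator.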
Herlau, T., & Larsen, R. (2022). Reinforcement Learning of Causal Variables Using Mediation Analysis. In Proceedings of the 36th AAAI Conference on Artificial Intelligence, AAAI 2022 (Vol. 36, pp. 6910–6917). Association for the Advancement of Artificial Intelligence. https://doi.org/10.1609/aaai.v36i6.20648