In this paper, we propose a new measure within the framework of reinforcement learning by modeling the learning process as an information source. We confirm experimentally that Lempel-Ziv coding of a string of episode sequences provides a quality measure of the degree of complexity of learning. In addition, we discuss functions that compare the expected return with its variance.
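As a rough illustration of the idea, the sketch below parses a sequence of symbols with LZ78-style incremental parsing and counts the resulting phrases. Fewer phrases for the same length suggest a more regular, more compressible sequence. The abstract does not say which Lempel-Ziv variant the authors use or how episodes are encoded as symbols, so the LZ78 parsing, the function names `lz78_phrase_count` and `normalized_lz_complexity`, the normalization by c·log2(c)/n, and the (state, action, reward) tuples are all assumptions made for illustration only.

```python
import math


def lz78_phrase_count(symbols):
    """Count the phrases in an LZ78-style incremental parsing.

    `symbols` is any sequence of hashable items (characters, or
    hypothetical (state, action, reward) tuples). This is an assumed
    variant, not necessarily the one used in the paper.
    """
    seen = set()      # phrases added to the dictionary so far
    phrase = ()       # phrase currently being extended
    count = 0
    for s in symbols:
        phrase += (s,)
        if phrase not in seen:
            # Shortest phrase not seen before: close it and start anew.
            seen.add(phrase)
            count += 1
            phrase = ()
    if phrase:
        count += 1    # trailing phrase that repeats an earlier one
    return count


def normalized_lz_complexity(symbols):
    """Approximate code length per symbol, c * log2(c) / n (an assumed normalization)."""
    n = len(symbols)
    if n == 0:
        return 0.0
    c = lz78_phrase_count(symbols)
    return c * math.log2(c) / n


# Hypothetical episode strings: each symbol is a (state, action, reward) tuple.
repetitive = [(0, 1, 0.0)] * 8
varied = [(i, i % 2, float(i)) for i in range(8)]

print(lz78_phrase_count(repetitive), lz78_phrase_count(varied))
print(normalized_lz_complexity(repetitive), normalized_lz_complexity(varied))
```

Under these assumptions, an agent that settles into the same transitions produces a lower count than one that keeps visiting new transitions, which is the kind of contrast a complexity measure for learning would need to capture.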
Iwata, K., & Ishii, N. (2002). Lempel-Ziv coding in reinforcement learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2412, pp. 531–537). Springer Verlag. https://doi.org/10.1007/3-540-45675-9_80