The tasks of optimizing asset allocation considering transaction costs can be formulated into the framework of Markov Decision Pro-cesses(MDPs) and reinforcement learning. In this paper, a risk-averse reinforcement learning algorithm is proposed which improves asset allocation strategy of portfolio management systems. The proposed algorithm alternates policy evaluation phases which take into account the mean and variance of return under a given policy and policy improvement phases which follow the variance-penalized criterion. The algorithm is tested on trading systems for a single future corresponding to a Japanese stock index.
CITATION STYLE
Sato, M., & Kobayashi, S. (2000). Variance-penalized reinforcement learning for risk-averse asset allocation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1983, pp. 244–249). Springer Verlag. https://doi.org/10.1007/3-540-44491-2_34
Mendeley helps you to discover research relevant for your work.