In multi-period trading with realistic market impact, determining the dynamic trading strategy that maximizes the expected utility of final wealth is a hard problem. In this paper we show that, with an appropriate choice of reward function, reinforcement learning techniques (specifically, Q-learning) can successfully handle the risk-averse case. We provide a proof of concept in the form of a simulated market that permits statistical arbitrage even in the presence of trading costs. The Q-learning agent finds and exploits this arbitrage.
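To make the idea concrete, the following is a minimal sketch of tabular Q-learning with a risk-adjusted reward on a toy mean-reverting price. It is an illustration under stated assumptions, not the paper's implementation. The abstract does not specify the reward function, so the mean-variance form used here (one-period P&L minus a penalty on its square) is an assumption, as are the price grid, the reversion probability, the cost per unit traded, and all names and parameter values.

```python
import random

ACTIONS = (-1, 0, 1)   # sell one unit, hold, buy one unit
EQUILIBRIUM = 5        # assumed level the toy price reverts to
MAX_PRICE = 10         # hypothetical upper end of the price grid


def risk_adjusted_reward(pnl, kappa=0.1):
    """Assumed mean-variance style reward: P&L minus a squared-P&L penalty."""
    return pnl - 0.5 * kappa * pnl ** 2


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step: move Q(s, a) toward the bootstrapped target."""
    best_next = max(q.get((next_state, a), 0.0) for a in ACTIONS)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


def next_price(price, rng):
    """Toy mean reversion: drift toward equilibrium with probability 0.7."""
    if price == EQUILIBRIUM:
        direction = 0
    else:
        direction = 1 if price < EQUILIBRIUM else -1
    step = direction if rng.random() < 0.7 else rng.choice((-1, 1))
    return min(max(price + step, 0), MAX_PRICE)


def train(steps=20000, epsilon=0.1, cost=0.1, seed=0):
    """Epsilon-greedy Q-learning; state is (price, holding), holding in {-1, 0, 1}."""
    rng = random.Random(seed)
    q = {}
    price, holding = EQUILIBRIUM, 0
    for _ in range(steps):
        state = (price, holding)
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)  # explore
        else:
            action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))  # exploit
        new_holding = min(max(holding + action, -1), 1)
        traded = abs(new_holding - holding)
        new_price = next_price(price, rng)
        # P&L on the position held over the period, net of a linear trading cost.
        pnl = new_holding * (new_price - price) - cost * traded
        q_update(q, state, action, risk_adjusted_reward(pnl),
                 (new_price, new_holding))
        price, holding = new_price, new_holding
    return q


if __name__ == "__main__":
    q_table = train()
    print(len(q_table), "state-action pairs visited")
```

The risk aversion enters only through `kappa`: setting it to zero recovers a risk-neutral agent that maximizes expected discounted P&L, while larger values make large swings in one-period P&L less attractive.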
CITATION STYLE
Raeisi, S., & Raeisi, S. (2023). Machine Learning for Physicists. Machine Learning for Physicists. IOP Publishing. https://doi.org/10.1088/978-0-7503-4957-4