Efficient Continuous Space Policy Optimization for High-frequency Trading

20Citations
Citations of this article
19Readers
Mendeley users who have this article in their library.
Get full text

Abstract

High-frequency trading is an extraordinarily intricate financial task, which is normally treated as a near real-time sequential decision problem. Compared with the traditional two-phase approach, forecasting equity's trend and then weighting them by combinatorial optimization, deep reinforcement learning (DRL) methods have shown advances in reward chasing with optimal policies. However, existing DRL-based methods either leverage portfolio optimization on low-frequency scenarios or only support a very limited number of assets with discrete action space, facing significant computing efficiency challenges. Therefore, we propose an efficient DRL-based policy optimization (DRPO) method for high-frequency trading. In particular, we model the portfolio management task with Markov Decision Process by directly inferring the equity weights in the action space guided by maximum accumulated returns. To reduce agents' interaction complexity without reducing interpretation, we detach the environment into the "static'' market states and "dynamic'' portfolio weight states. Then, we design an efficient reward expectation calculation algorithm via probabilistic dynamic programming, which enables our agents directly collect feedback away from trajectory sampling-based morass. To the best of our knowledge, this is the first work that solves the high-frequency portfolio optimization problem by devising an efficient continuous space policy optimization algorithm in the DRL framework. Through extensive experiments on the real-world data from Dow Jones, Coinbase and SSE exchanges, we show that our proposed DRPO significantly outperforms state-of-the-art benchmark methods. The results demonstrate the practical applicability and effectiveness of the proposed method.

Cite

CITATION STYLE

APA

Han, L., Ding, N., Wang, G., Cheng, D., & Liang, Y. (2023). Efficient Continuous Space Policy Optimization for High-frequency Trading. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 4112–4122). Association for Computing Machinery. https://doi.org/10.1145/3580305.3599813

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free