Reinforcement Learning

Abstract

This chapter briefly reviews some fundamental concepts, standard problem formulations, and classical algorithms of reinforcement learning (RL). Specifically, we first review Markov decision processes (MDPs) and dynamic programming (DP), which provide mathematical foundations for both the problem formulation and algorithm design for RL. Then we review some classical RL algorithms, such as Q-learning, Sarsa, policy gradient, and Thompson sampling. Finally, we provide a high-level review of the exploration schemes in RL and approximate solution methods for large-scale RL problems. At the end of this chapter, we also provide some pointers for further reading.

Citation (APA)

Wen, Z. (2022). Reinforcement Learning. In Springer Series in Supply Chain Management (Vol. 18, pp. 15–48). Springer Nature. https://doi.org/10.1007/978-3-031-01926-5_2
