In this chapter, we take the idea of the policy-gradient-based REINFORCE with baseline algorithm further and combine it with the value-estimation ideas from the DQN, thus bringing the best of both worlds together in the form of the Actor-Critic...
Sewak, M. (2019). Actor-Critic Models and the A3C. In Deep Reinforcement Learning (pp. 141–152). Springer Singapore. https://doi.org/10.1007/978-981-13-8285-7_11