Deep Reinforcement Learning (Deep RL) is applied to many areas in which an agent learns how to interact with its environment to achieve a goal, such as playing video games and controlling robots. Deep RL exploits a DNN to eliminate the need for handcrafted feature engineering, which requires prior domain knowledge. Asynchronous Advantage Actor-Critic (A3C) is one of the state-of-the-art Deep RL methods. In this paper, we present an FPGA-based A3C Deep RL platform, called FA3C. Traditionally, FPGA-based DNN accelerators have focused mainly on inference by exploiting fixed-point arithmetic; our platform targets both inference and training using single-precision floating-point arithmetic. We demonstrate the performance and energy efficiency of FA3C using multiple A3C agents that learn the control policies of six Atari 2600 games. FA3C achieves 27.9% better performance than a state-of-the-art implementation on a high-end GPU (NVIDIA Tesla P100), and its energy efficiency is 1.62× better than that of the GPU-based implementation.
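For readers unfamiliar with A3C, the following is a minimal sketch of the n-step return and advantage computation at the heart of each A3C actor-learner (Mnih et al., 2016). It is not the paper's FA3C implementation; the rollout values are hypothetical and the code is plain NumPy.

```python
import numpy as np

# Hypothetical rollout data from one A3C worker.
rewards = np.array([0.0, 0.0, 1.0], dtype=np.float32)  # r_t from the environment
values = np.array([0.5, 0.6, 0.7], dtype=np.float32)   # V(s_t) from the critic
bootstrap_value = 0.8                                   # V(s_T) for the final state
gamma = 0.99                                            # discount factor

# n-step returns computed backwards from the bootstrap value.
returns = np.zeros_like(rewards)
running = bootstrap_value
for t in reversed(range(len(rewards))):
    running = rewards[t] + gamma * running
    returns[t] = running

# The advantage A_t = R_t - V(s_t) drives both updates:
#   actor loss:  -log pi(a_t | s_t) * A_t  (plus an entropy bonus)
#   critic loss:  A_t ** 2
advantages = returns - values
print(returns, advantages)
```

In A3C, many such workers run asynchronously and apply their gradients to shared network parameters, which is the workload FA3C maps onto the FPGA for both inference and training.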
Cho, H., Oh, P., Park, J., Jung, W., & Lee, J. (2019). FA3C: FPGA-Accelerated Deep Reinforcement Learning. In International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS (pp. 499–513). Association for Computing Machinery. https://doi.org/10.1145/3297858.3304058