This paper studies the joint spectrum allocation and power control (SA-PC) problem for device-to-device (D2D) communication underlaying a cellular network. It proposes a distributed joint SA-PC algorithm, based on multi-agent reinforcement learning (MARL), that performs spectrum allocation and power control for each D2D user in the network. The algorithm uses Q-learning, a typical form of reinforcement learning (RL), to select the optimal resource block (RB) and power level for each D2D user. Each D2D user is treated as an individual agent that maintains a single-state Q table and, during the learning process, selects an RB and a power level according to that table. Simulation results show that the proposed Q-learning based joint SA-PC algorithm can achieve good throughput performance.
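The per-agent mechanism described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the reward function, the exploration rule, or any parameter values, so the epsilon-greedy policy, the learning rate `alpha`, the discount `gamma`, the class name `D2DAgent`, and the interference model in `toy_reward` are all assumptions made here for illustration only.

```python
import math
import random


class D2DAgent:
    """One D2D user with a single-state Q table over (RB, power) pairs.

    Hypothetical sketch: alpha, gamma, epsilon are assumed values,
    not parameters taken from the paper.
    """

    def __init__(self, num_rbs, power_levels, alpha=0.1, gamma=0.0,
                 epsilon=0.1, seed=None):
        # Joint action space: every resource block paired with every power level.
        self.actions = [(rb, p) for rb in range(num_rbs) for p in power_levels]
        # Single state, so the Q table is indexed by action only.
        self.q = {a: 0.0 for a in self.actions}
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def select(self):
        # Epsilon-greedy (assumed): explore with probability epsilon,
        # otherwise pick the action with the largest Q value.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=self.q.get)

    def update(self, action, reward):
        # Standard Q-learning update, reduced to a single state.
        target = reward + self.gamma * max(self.q.values())
        self.q[action] += self.alpha * (target - self.q[action])


def toy_reward(i, choices, gain=1.0, cross_gain=0.1, noise=0.01):
    """Assumed reward: Shannon rate of agent i, with interference only
    from other agents that chose the same RB. Purely illustrative."""
    rb, p = choices[i]
    interference = sum(cross_gain * pj
                       for j, (rbj, pj) in enumerate(choices)
                       if j != i and rbj == rb)
    return math.log2(1.0 + gain * p / (noise + interference))


def simulate(num_agents=3, num_rbs=2, power_levels=(0.1, 0.5, 1.0),
             steps=500, seed=0):
    """Each agent acts independently, then learns from its own reward."""
    agents = [D2DAgent(num_rbs, power_levels, seed=seed + k)
              for k in range(num_agents)]
    choices = []
    for _ in range(steps):
        choices = [agent.select() for agent in agents]
        for i, agent in enumerate(agents):
            agent.update(choices[i], toy_reward(i, choices))
    return choices
```

Calling `simulate()` returns the last (RB, power) pair chosen by each agent. Because every agent learns only from its own reward, no central controller is involved, which matches the distributed setting the abstract describes; whether this toy reward reproduces the reported throughput behaviour is not something the sketch can show.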
Citation
Chen, W., & Zheng, J. (2019). A Reinforcement Learning Based Joint Spectrum Allocation and Power Control Algorithm for D2D Communication Underlaying Cellular Networks. In Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST (Vol. 286, pp. 146–158). Springer Verlag. https://doi.org/10.1007/978-3-030-22968-9_13