Achieving correlated equilibrium by studying opponent's behavior through policy-based deep reinforcement learning

Kuo Chun Tsai; Zhu Han

Journal Article, Open Access

IEEE Access (2020) 8 199682-199695

DOI: 10.1109/ACCESS.2020.3035362

Citations: 2

Readers: 15

Abstract

Game theory is a profound study of distributed decision-making behavior and has been extensively developed by many scholars. However, many existing works rely on strict assumptions, such as knowledge of the opponent's private behaviors, which might not be practical. In this work, we focus on two Nobel Prize-winning concepts: the Nash equilibrium and the correlated equilibrium. We propose a policy-based deep reinforcement learning model that, instead of just learning the regions for corresponding strategies and actions, learns why and how the rational opponent plays. With this model, we successfully reach the correlated equilibrium that maximizes the utility for each player. Depending on the scenario, the equilibrium can lie outside the convex hull of the Nash equilibria and achieve higher utility for the players, which traditional no-regret algorithms cannot do. In addition, we propose a mathematical model that inverts the calculation of the correlated equilibrium probability to estimate the rational opponent's payoff. Through simulations with limited interaction among the players, we show that the proposed method can achieve the optimal correlated equilibrium, in which each player gains a utility equal to or higher than that of the Nash equilibrium.
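
To make the notion of a correlated equilibrium concrete, the following minimal sketch checks the incentive constraints for a joint distribution over action pairs in a two-player game. It is an illustration only and is not the deep reinforcement learning model proposed in the paper. The payoff matrix (a textbook game of Chicken) and the function names `is_correlated_equilibrium` and `expected_utilities` are assumptions introduced here, not taken from the article.

```python
import numpy as np

# Illustrative payoffs for the game of Chicken (assumed, not from the paper).
# Actions: 0 = yield, 1 = dare. U1[a1, a2] is the row player's payoff,
# U2[a1, a2] is the column player's payoff.
U1 = np.array([[6.0, 2.0],
               [7.0, 0.0]])
U2 = U1.T  # the game is symmetric


def is_correlated_equilibrium(p, u1, u2, tol=1e-9):
    """Return True if the joint distribution p[a1, a2] satisfies every
    incentive constraint: no player gains by deviating from a recommendation."""
    n1, n2 = p.shape
    # Row player: recommended action a, candidate deviation dev.
    for a in range(n1):
        for dev in range(n1):
            if np.dot(p[a, :], u1[a, :] - u1[dev, :]) < -tol:
                return False
    # Column player: recommended action a, candidate deviation dev.
    for a in range(n2):
        for dev in range(n2):
            if np.dot(p[:, a], u2[:, a] - u2[:, dev]) < -tol:
                return False
    return True


def expected_utilities(p, u1, u2):
    """Expected payoff of each player under the joint distribution p."""
    return float(np.sum(p * u1)), float(np.sum(p * u2))


# Equal weight on (yield, yield), (yield, dare), (dare, yield); never (dare, dare).
p_corr = np.array([[1 / 3, 1 / 3],
                   [1 / 3, 0.0]])
print(is_correlated_equilibrium(p_corr, U1, U2))
print(expected_utilities(p_corr, U1, U2))
```

In this assumed game, the distribution `p_corr` passes the check and gives each player an expected utility of about 5, so the utilities sum to about 10. The two pure Nash equilibria each sum to 9 and the mixed Nash equilibrium sums to 28/3, so no convex combination of Nash outcomes reaches a sum of 10. This is the sense in which a correlated equilibrium can lie outside the Nash equilibrium convex hull.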

Citation (APA)

Tsai, K. C., & Han, Z. (2020). Achieving correlated equilibrium by studying opponent’s behavior through policy-based deep reinforcement learning. IEEE Access, 8, 199682–199695. https://doi.org/10.1109/ACCESS.2020.3035362
