Achieving correlated equilibrium by studying opponent's behavior through policy-based deep reinforcement learning

Abstract

Game theory is the study of distributed decision-making behavior and has been developed extensively by many scholars. However, many existing works rely on strict assumptions, such as knowledge of the opponent's private behaviors, which may not be practical. In this work, we focus on two Nobel Prize-winning concepts: the Nash equilibrium and the correlated equilibrium. We propose a policy-based deep reinforcement learning model that, instead of only learning the regions for the corresponding strategies and actions, learns why and how a rational opponent plays. With this model, we reach a correlated equilibrium that maximizes the utility of each player. Depending on the scenario, this equilibrium can lie outside the convex hull of the Nash equilibria and thereby achieve higher utility for the players, which traditional no-regret algorithms cannot do. In addition, we propose a mathematical model that inverts the calculation of the correlated equilibrium probabilities to estimate the rational opponent's payoff. Through simulations with limited interaction among the players, we show that the proposed method achieves the optimal correlated equilibrium, in which each player gains a utility equal to or higher than that of the Nash equilibrium.
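To make the claim about utilities beyond the Nash equilibrium convex hull concrete, the following is a minimal sketch using the textbook two-player game of Chicken. It is not the model or the simulation setup of the article: the payoff table, the distributions, and the helper names (`expected_utility`, `is_correlated_equilibrium`) are illustrative assumptions. The sketch checks the standard correlated equilibrium condition, namely that no player can gain in expectation by deviating from a recommended action, and compares the resulting utility with the mixed Nash equilibrium of the same game.

```python
from fractions import Fraction as F

# Assumed textbook game of Chicken (not taken from the article).
# Action 0 = Dare, action 1 = Chicken.
# PAYOFF[(a1, a2)] = (utility of player 1, utility of player 2).
PAYOFF = {
    (0, 0): (0, 0), (0, 1): (7, 2),
    (1, 0): (2, 7), (1, 1): (6, 6),
}
ACTIONS = (0, 1)


def expected_utility(dist):
    """Expected utility of each player under a joint action distribution."""
    return tuple(
        sum(p * PAYOFF[joint][i] for joint, p in dist.items())
        for i in range(2)
    )


def is_correlated_equilibrium(dist):
    """True if no player gains by deviating from any recommended action."""
    for player in range(2):
        for rec in ACTIONS:          # action recommended to `player`
            for dev in ACTIONS:      # action the player might switch to
                gain = F(0)
                for other in ACTIONS:
                    followed = (rec, other) if player == 0 else (other, rec)
                    deviated = (dev, other) if player == 0 else (other, dev)
                    gain += dist.get(followed, F(0)) * (
                        PAYOFF[deviated][player] - PAYOFF[followed][player]
                    )
                if gain > 0:
                    return False
    return True


# Candidate correlated equilibrium: equal weight on every outcome
# except (Dare, Dare).
CE = {(0, 1): F(1, 3), (1, 0): F(1, 3), (1, 1): F(1, 3)}

# Mixed Nash equilibrium of this game: each player dares with probability 1/3,
# written here as the product distribution over joint actions.
MIXED_NASH = {
    (0, 0): F(1, 9), (0, 1): F(2, 9),
    (1, 0): F(2, 9), (1, 1): F(4, 9),
}

if __name__ == "__main__":
    print("CE valid:", is_correlated_equilibrium(CE))
    print("CE utilities:", expected_utility(CE))
    print("Mixed Nash utilities:", expected_utility(MIXED_NASH))
```

In this assumed game, the three Nash equilibria give total utilities of 9, 9, and 28/3, so every point in their convex hull has a total of at most 28/3. The correlated distribution above gives each player 5, a total of 10, which is why it lies outside that hull. Exact fractions are used so that the equilibrium inequalities are not affected by floating-point rounding.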

Cite

Tsai, K. C., & Han, Z. (2020). Achieving correlated equilibrium by studying opponent’s behavior through policy-based deep reinforcement learning. IEEE Access, 8, 199682–199695. https://doi.org/10.1109/ACCESS.2020.3035362
