Verifying the Gaming Strategy of Self-learning Game by Using PRISM-Games


Abstract

Reinforcement Learning (RL) has gained enormous popularity in computer science and is applied in fields such as gaming, intelligent robots, and remote sensing. The objective of reinforcement learning is to generate an optimal policy. The main problem with that optimal policy is that it is not guaranteed to satisfy all of the system specifications. Model checking is a technique for verifying that a system meets its specifications. PRISM-games is a model-checking tool used to verify probabilistic systems with competitive or collaborative behavior. Safe Reinforcement Learning via Shielding is a method that uses a shield to restrict the actions of an RL agent whenever an action would violate a specification expressed in temporal logic. This paper compares the winning strategies of three agents on the game Tic-Tac-Toe: a Monte-Carlo Tree Search (MCTS) agent, an RL agent, and a shielded RL agent (SRL) that uses PRISM-games to restrict its actions. Over a thousand simulations were run; the experiments show that the MCTS agent has the highest win rate of the three, but the losing rate of the shielded agent is reduced by using PRISM-games.
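The shielding idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: in the paper the shield is synthesized from a PRISM-games model of the temporal-logic specification, whereas here `is_safe` is a hypothetical stand-in for that specification check, and the function names are assumptions for illustration only.

```python
import random

def is_safe(state, action):
    """Hypothetical stand-in for a specification check.

    In the paper's setting this decision would come from a shield
    synthesized with PRISM-games; here unsafe actions are simply
    listed in the state for demonstration.
    """
    return action not in state.get("forbidden", set())

def shielded_action(state, preferred_action, all_actions):
    """Execute the agent's chosen action only if the shield allows it.

    If the preferred action violates the specification, substitute a
    randomly chosen safe alternative instead of letting the agent act.
    """
    if is_safe(state, preferred_action):
        return preferred_action
    safe_alternatives = [a for a in all_actions if is_safe(state, a)]
    return random.choice(safe_alternatives) if safe_alternatives else preferred_action

# Example: the agent prefers action 3, but the shield forbids it,
# so one of the remaining safe actions is played instead.
state = {"forbidden": {3}}
print(shielded_action(state, 3, [0, 1, 2, 3]))
```

The key design point is that the shield intervenes at action-selection time, so the RL agent can still learn freely while unsafe moves are never actually executed.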

Citation (APA)

Zaw, H. H., & Hlaing, S. Z. (2020). Verifying the Gaming Strategy of Self-learning Game by Using PRISM-Games. In Advances in Intelligent Systems and Computing (Vol. 1072, pp. 148–159). Springer. https://doi.org/10.1007/978-3-030-33585-4_15
