Abstract
This paper analyses a simple epsilon-greedy exploration approach to training models with the Deep Q-Learning algorithm. The randomness it introduces keeps the agent from committing to the first solution it finds, allowing it to keep exploring alternative solutions even after one has been found and thus to reach the global optimum rather than getting stuck in a local optimum. A simple block environment is built to assess the agent's ability to reach a destination: block A must reach block B. The model is trained repeatedly by feeding it the game image and rewarding it based on the decisions it makes, and the weights of the reinforcement learning model's neural network are adjusted after every iteration to improve the result. Furthermore, two different environments from Python's Gym library are used to corroborate the results obtained. TensorFlow is used to build the model and run it on the GPU for accelerated computation.
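For readers unfamiliar with the mechanism, the following is a minimal sketch of the epsilon-greedy action selection the abstract describes, assuming a small TensorFlow Q-network over a flattened game image. The network architecture, observation size, action set, and function names are illustrative assumptions, not the authors' exact setup.

```python
import numpy as np
import tensorflow as tf

# Illustrative toy setup (assumed, not the paper's exact model):
# a small Q-network mapping a flattened game image to one Q-value per action.
NUM_ACTIONS = 4          # e.g. up, down, left, right in a block environment
OBS_SIZE = 84 * 84       # flattened grayscale game image (assumed size)

q_network = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(OBS_SIZE,)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(NUM_ACTIONS),  # one Q-value per action
])

def epsilon_greedy_action(state, epsilon):
    """With probability epsilon take a random action (explore);
    otherwise take the action with the highest predicted Q-value (exploit)."""
    if np.random.rand() < epsilon:
        return np.random.randint(NUM_ACTIONS)
    q_values = q_network(state[np.newaxis, :], training=False)
    return int(tf.argmax(q_values[0]).numpy())

# Usage: epsilon is typically annealed from 1.0 toward a small floor,
# so early training explores broadly and later training mostly exploits.
state = np.zeros(OBS_SIZE, dtype=np.float32)  # placeholder observation
action = epsilon_greedy_action(state, epsilon=0.1)
```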
Citation
Hariharan, N., & Anand, P. G. (2022). A Brief Study of Deep Reinforcement Learning with Epsilon-Greedy Exploration. International Journal of Computing and Digital Systems, 11(1), 541–551. https://doi.org/10.12785/ijcds/110144