Damped Sinusoidal Exploration Decay Schedule to Improve Deep Q-Networks-Based Agent Performance

Abstract

Environments in which rewards are sparse or occur rarely are among the most challenging for reinforcement learning (RL) agents and can take a long time to train in. The decision between attempting to gain new experience and using already acquired knowledge to perform what is believed to be the optimal action in a given state is governed by the exploration factor. This dilemma is known as the exploration–exploitation trade-off: choosing between what the agent already knows and what it could potentially discover. In this paper, a new sinusoidal ε-decay function is constructed, which steadily decreases the exploration factor and improves performance. The approach is inspired by the sinusoidal curve, which introduces an “oscillation” of values. It is expected to stabilize and speed up a DQN-based agent’s learning process, preventing it from diverging even after long training periods. The new exploration decay equation can be adopted for other RL algorithms as well.
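
The abstract does not state the exact decay equation, but a damped sinusoidal ε-schedule can be pictured as an exponentially decaying envelope modulated by a cosine, so that ε oscillates while trending toward a floor value. The Python sketch below is a minimal illustration under that assumption; the functional form, the decay_rate and omega parameters, and the epsilon_greedy_action helper are illustrative choices, not the paper's published equation.

import math
import random

def damped_sinusoidal_epsilon(step, eps_max=1.0, eps_min=0.01,
                              decay_rate=1e-4, omega=0.01):
    """One possible damped sinusoidal exploration schedule (illustrative).

    An exponentially decaying envelope is modulated by a cosine term so
    that epsilon "oscillates" while decreasing overall, and it never
    falls below eps_min.
    """
    envelope = (eps_max - eps_min) * math.exp(-decay_rate * step)
    oscillation = 0.5 * (1.0 + math.cos(omega * step))  # maps cosine into [0, 1]
    return eps_min + envelope * oscillation

def epsilon_greedy_action(q_values, step, n_actions):
    """Standard epsilon-greedy selection driven by the schedule above."""
    eps = damped_sinusoidal_epsilon(step)
    if random.random() < eps:
        return random.randrange(n_actions)                      # explore
    return max(range(n_actions), key=lambda a: q_values[a])     # exploit

Plugging such a schedule into a DQN training loop only requires replacing the usual linear or exponential ε-decay call with damped_sinusoidal_epsilon(step); the periodic bumps in ε give the agent recurring bursts of exploration while the envelope keeps the long-run trend decreasing.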

Cite (APA)

Nilesh Sanghvi, H., & Chamundeeswari, G. (2020). Damped Sinusoidal Exploration Decay Schedule to Improve Deep Q-Networks-Based Agent Performance. In Advances in Intelligent Systems and Computing (Vol. 1056, pp. 651–661). Springer. https://doi.org/10.1007/978-981-15-0199-9_56
