Q-learning policies for multi-agent foraging task

Abstract

The trade-off between exploitation and exploration in multi-agent learning has been a crucial area of research for the past few decades. A proper learning policy is necessary for agents to react rapidly and adapt in a dynamic environment. A family of core learning policies suitable for the non-stationary multi-agent foraging task modeled in this paper was identified in the open literature. The model is used to compare and contrast the identified learning policies, namely greedy, ε-greedy, and the Boltzmann distribution. A simple random search is also included as a baseline to justify the convergence of Q-learning. A number of simulation-based experiments were conducted, and the performances of the learning policies are discussed based on the numerical results obtained. © 2010 Springer-Verlag.
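The paper evaluates these policies empirically; as a point of reference, a minimal tabular sketch of the three action-selection rules named above, together with the standard one-step Q-learning update, might look like the following. The function names, hyperparameter values, and NumPy implementation are illustrative assumptions, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

def greedy(q_values):
    """Pure exploitation: always pick the action with the highest Q-value."""
    return int(np.argmax(q_values))

def epsilon_greedy(q_values, epsilon=0.1):
    """With probability epsilon explore uniformly at random; otherwise exploit."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))

def boltzmann(q_values, temperature=1.0):
    """Sample an action from a softmax (Boltzmann) distribution over Q-values.
    High temperature -> near-uniform exploration; low -> near-greedy."""
    prefs = np.asarray(q_values, dtype=float) / temperature
    prefs -= prefs.max()                         # subtract max for numerical stability
    probs = np.exp(prefs) / np.exp(prefs).sum()
    return int(rng.choice(len(q_values), p=probs))

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update on a tabular Q-function Q[state, action]."""
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
```

For instance, with `Q = np.zeros((n_states, n_actions))`, each step of an episode would select `a = epsilon_greedy(Q[s])` (or one of the other two rules), observe `r` and `s_next` from the environment, and then call `q_update(Q, s, a, r, s_next)`.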

Citation (APA)

Yogeswaran, M., & Ponnambalam, S. G. (2010). Q-learning policies for multi-agent foraging task. In Communications in Computer and Information Science (Vol. 103 CCIS, pp. 194–201). https://doi.org/10.1007/978-3-642-15810-0_25
