When reinforcement learning is applied to a multi-agent environment, a credit assignment problem arises, because it is sometimes difficult to determine which agents are the real contributors. If we praise all agents whenever a group of cooperative agents receives a reward, agents that did not contribute to it will also reinforce their policies. On the other hand, if we praise only the obvious contributors, indirect contributions will not be reinforced. As a first step toward easing this dilemma, we propose a classification of rewards and then investigate its properties. We use a positioning task on SoccerServer for the experiments. The empirical results show that direct reward takes effect faster and helps agents acquire individuality. In contrast, indirect reward takes effect more slowly, but agents tend to form a group and acquire a different effective positioning. © Springer-Verlag Berlin Heidelberg 2003.
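The two praise schemes contrasted in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `assign_rewards`, the scheme labels `"all"` and `"contributors_only"`, and the agent names are hypothetical, and the sketch assumes the set of obvious contributors is already known. It does not attempt to reproduce the paper's classification into direct and indirect reward.

```python
from typing import Dict, Iterable, Set


def assign_rewards(
    agents: Iterable[str],
    contributors: Set[str],
    team_reward: float,
    scheme: str,
) -> Dict[str, float]:
    """Distribute a team reward under one of two hypothetical schemes.

    "all":               every agent receives the team reward, so agents
                         that did not contribute are reinforced as well.
    "contributors_only": only the obvious contributors receive the reward,
                         so indirect contributions are not reinforced.
    """
    if scheme not in ("all", "contributors_only"):
        raise ValueError(f"unknown scheme: {scheme!r}")

    rewards: Dict[str, float] = {}
    for agent in agents:
        if scheme == "all" or agent in contributors:
            rewards[agent] = team_reward
        else:
            rewards[agent] = 0.0
    return rewards


# Hypothetical example: three agents, one obvious contributor.
team = ["striker", "midfielder", "defender"]
praise_all = assign_rewards(team, {"striker"}, 1.0, "all")
praise_some = assign_rewards(team, {"striker"}, 1.0, "contributors_only")
```

Under `"all"`, the defender is reinforced even if it did nothing useful; under `"contributors_only"`, a midfielder whose positioning made the play possible receives nothing. This is the dilemma the abstract describes.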
CITATION STYLE
Ohta, M. (2003). Direct reward and indirect reward in multi-agent reinforcement learning. In Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (Vol. 2752, pp. 359–366). Springer Verlag. https://doi.org/10.1007/978-3-540-45135-8_31