Proposal of detour path suppression method in PS reinforcement learning and its application to altruistic multi-agent environment

Daisuke Shiraishi; Kazuteru Miyazaki; Hiroaki Kobayashi

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2018) 11224 LNAI 638-645

DOI: 10.1007/978-3-030-03098-8_51

Abstract

Profit Sharing (PS) is a well-known reinforcement learning method. In PS, a reward is generally distributed with a geometrically decreasing function, and the common ratio of that function is called the discount rate. A large discount rate increases the learning speed, but a non-optimal policy may be learned. A small discount rate, on the other hand, improves the performance of the learned policy, but learning may not proceed smoothly because the learning depth is shallow. To cope with these problems, this paper proposes a method that reinforces detour paths and the non-detour path with different discount rates. Finally, the method is applied to an altruistic multi-agent environment to confirm its effectiveness.
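
The reward distribution described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' algorithm: the abstract does not say how detours are detected or which rate each kind of path receives. The sketch assumes that a detour is any segment of an episode that returns to an already visited state, and that detour steps receive the smaller discount rate. The names `split_detours`, `ps_update`, `gamma_path`, and `gamma_detour` are hypothetical.

```python
from collections import defaultdict


def split_detours(episode):
    """Return a list of booleans, True where a step lies on a detour.

    Assumption (not from the paper): a detour is a segment of the episode
    that ends by revisiting a state already on the current loop-free path.
    """
    path = []          # indices of steps currently on the loop-free path
    position = {}      # state -> position of that state within `path`
    is_detour = [False] * len(episode)
    for i, (state, _action) in enumerate(episode):
        if state in position:
            # The agent came back to `state`: everything since the earlier
            # visit formed a loop, so mark those steps as a detour.
            cut = position[state]
            for j in path[cut:]:
                is_detour[j] = True
                del position[episode[j][0]]
            del path[cut:]
        position[state] = len(path)
        path.append(i)
    return is_detour


def ps_update(weights, episode, reward, gamma_path, gamma_detour):
    """Distribute `reward` backwards along the episode.

    A step k steps before the reward receives reward * gamma**k, where
    gamma is `gamma_detour` for detour steps and `gamma_path` otherwise
    (an assumed way of applying two different discount rates).
    """
    is_detour = split_detours(episode)
    last = len(episode) - 1
    for i, (state, action) in enumerate(episode):
        gamma = gamma_detour if is_detour[i] else gamma_path
        weights[(state, action)] += reward * gamma ** (last - i)
    return weights


# Hypothetical episode: the agent leaves A, comes back to A, then reaches
# the reward through C. The first two steps form a detour.
episode = [("A", "right"), ("B", "left"), ("A", "down"), ("C", "down")]
weights = ps_update(defaultdict(float), episode, reward=1.0,
                    gamma_path=0.9, gamma_detour=0.1)
```

With these assumed rates, the rules on the loop `A -> B -> A` receive far less credit than the rule `("A", "down")` that leads towards the reward, which is the kind of suppression the title refers to. A single shared rate would force a trade-off between learning depth and reinforcing the loop.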

Cite

Shiraishi, D., Miyazaki, K., & Kobayashi, H. (2018). Proposal of detour path suppression method in PS reinforcement learning and its application to altruistic multi-agent environment. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11224 LNAI, pp. 638–645). Springer Verlag. https://doi.org/10.1007/978-3-030-03098-8_51
