This paper explores ways to discover strategy from a state-action-state-reward log recorded during a reinforcement learning session. The term strategy here implies that we are interested not only in a one-step state-action pair but also in a fruitful sequence of state-actions. Traditional RL has been shown to learn good sequences of actions. However, the learned action sequences are often less effective than they could be. For example, an effective five-step navigation to the north can be achieved in thousands of ways if there are no other constraints, since an agent could follow numerous tactics to reach the same end result. Traditional RL methods such as value learning or state-action value learning do not directly address this issue. In this preliminary experiment, sets of state-action pairs (i.e., one-step policies) are extracted from 10,446 records, grouped, and then joined to form a directed graph. This graph summarizes the policy learned by the agent. We argue that strategy could be extracted from the analysis of this graph network.
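The graph construction described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the record layout `(state, action, next_state, reward)` and the choice to annotate edges with visit counts and accumulated reward are assumptions, since the abstract does not give a schema.

```python
from collections import defaultdict

def build_policy_graph(log):
    """Group one-step (state, action) pairs from an RL log and join them
    into a directed graph: nodes are states, edges are observed transitions.

    `log` is assumed to be a list of (state, action, next_state, reward)
    tuples; the actual record format in the paper may differ.
    """
    graph = defaultdict(dict)
    for state, action, next_state, reward in log:
        edge = graph[state].setdefault((action, next_state),
                                       {"count": 0, "reward": 0.0})
        edge["count"] += 1        # how often this transition was logged
        edge["reward"] += reward  # total reward observed along this edge
    return graph

# Toy log: two alternative routes from s0 to s2.
log = [
    ("s0", "N", "s1", 0.0),
    ("s1", "N", "s2", 1.0),
    ("s0", "E", "s3", 0.0),
    ("s3", "N", "s2", 1.0),
    ("s0", "N", "s1", 0.0),
]
graph = build_policy_graph(log)
```

Analyzing such a graph (e.g., comparing edge counts or rewards along alternative paths between the same pair of states) is one way the paper's notion of strategy extraction could proceed.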
CITATION STYLE
Haji Mohd Sani, N., Phon-Amnuaisuk, S., & Au, T. W. (2019). Discovering strategy in navigation problem. In Communications in Computer and Information Science (Vol. 1071, pp. 231–239). Springer Verlag. https://doi.org/10.1007/978-981-32-9563-6_24