Point-based online value iteration algorithm for POMDPs

Bo Wu; Min Wu; Jin Hua She

Journal Article (Open Access)

Ruan Jian Xue Bao/Journal of Software (2013) 24(1) 25-36

DOI: 10.3724/SP.J.1001.2013.04258

Abstract

Partially observable Markov decision processes (POMDPs) provide a rich framework for sequential decision-making in stochastic domains under uncertainty. However, solving POMDPs is typically computationally intractable because the belief state space suffers from two curses, dimensionality and history, and existing online algorithms cannot simultaneously achieve low error and timely responses. To address these problems, this paper proposes a point-based online value iteration (PBOVI) algorithm for POMDPs. To speed up POMDP solving, the algorithm performs value backups at specific reachable belief points rather than over the entire belief simplex. It uses branch-and-bound pruning to prune the AND/OR tree of belief states online, and it reuses previously computed belief states to avoid repeated computation. Experimental and simulation results show that the proposed algorithm reduces the cost of computing policies while retaining policy quality, so it can meet the requirements of real-time systems. © 2013 ISCAS.
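To make the idea of a value backup at a single belief point concrete, the following is a minimal sketch of a standard point-based backup over alpha-vectors. It is not the authors' PBOVI implementation: the two-state model (`T`, `O`, `R`), the function name `point_based_backup`, and the discount value are hypothetical and chosen only for illustration, and the branch-and-bound pruning and belief reuse described in the abstract are not shown.

```python
# Hypothetical toy POMDP (2 states, 2 actions, 2 observations); not from the paper.
# T[a][s][s2]: probability of moving from state s to s2 under action a.
T = [[[0.9, 0.1], [0.1, 0.9]],
     [[0.9, 0.1], [0.1, 0.9]]]
# O[a][s2][o]: probability of observing o after action a lands in state s2.
O = [[[0.85, 0.15], [0.15, 0.85]],
     [[0.85, 0.15], [0.15, 0.85]]]
# R[a][s]: immediate reward for taking action a in state s.
R = [[1.0, 0.0],
     [0.0, 1.0]]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def point_based_backup(b, alphas, T, O, R, gamma):
    """One Bellman backup at belief point b.

    Returns (best_action, new_alpha), where new_alpha is the alpha-vector
    that maximises the backed-up value at b.
    """
    n_states = len(b)
    n_obs = len(O[0][0])
    best = None
    for a in range(len(T)):
        new_alpha = list(R[a])
        for o in range(n_obs):
            # Project every alpha-vector back through action a and observation o.
            projections = [
                [sum(T[a][s][s2] * O[a][s2][o] * alpha[s2]
                     for s2 in range(n_states))
                 for s in range(n_states)]
                for alpha in alphas
            ]
            # Keep only the projection that is best at this belief point.
            g = max(projections, key=lambda p: dot(b, p))
            new_alpha = [x + gamma * y for x, y in zip(new_alpha, g)]
        value = dot(b, new_alpha)
        if best is None or value > best[0]:
            best = (value, a, new_alpha)
    return best[1], best[2]


# Usage: back up the belief (0.8, 0.2) starting from a zero value function.
action, alpha = point_based_backup([0.8, 0.2], [[0.0, 0.0]], T, O, R, 0.9)
```

Because the maximisation is carried out only at the given belief point, each backup yields a single alpha-vector instead of enumerating every combination over the whole simplex, which is where point-based methods save computation.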

Cite

Wu, B., Wu, M., & She, J. H. (2013). Point-based online value iteration algorithm for POMDPs. Ruan Jian Xue Bao/Journal of Software, 24(1), 25–36. https://doi.org/10.3724/SP.J.1001.2013.04258
