Point-based online value iteration algorithm for POMDPs

Abstract

Partially observable Markov decision processes (POMDPs) provide a rich framework for sequential decision-making in stochastic, uncertain domains. However, solving POMDPs is typically computationally intractable, both because the belief space suffers from two curses, dimensionality and history, and because existing online algorithms cannot simultaneously achieve low error and timely responses. To address these problems, this paper proposes a point-based online value iteration (PBOVI) algorithm for POMDPs. To speed up solving, the algorithm performs value backups at specific reachable belief points rather than over the entire belief simplex. It uses a branch-and-bound approach to prune the AND/OR tree of belief states online, and it reuses previously computed belief states to avoid repeated computation. Experimental and simulation results show that the proposed algorithm reduces the cost of computing policies while retaining policy quality, so it can meet the requirements of a real-time system. © 2013 ISCAS.
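
To make the idea of backing up values at individual belief points concrete, the following is a minimal sketch of the standard belief update and a generic point-based backup over alpha vectors. It is not the paper's PBOVI implementation: the function names (belief_update, point_backup) and the nested-list layout of T, Z, and R are assumptions chosen for illustration, and branch-and-bound pruning and belief reuse are not shown.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def belief_update(b, a, o, T, Z):
    """Bayes update of belief b after taking action a and observing o.

    Assumed layout (hypothetical, for illustration only):
      T[a][s][s2] = P(s2 | s, a),  Z[a][s2][o] = P(o | s2, a).
    """
    n = len(b)
    unnorm = [
        Z[a][s2][o] * sum(T[a][s][s2] * b[s] for s in range(n))
        for s2 in range(n)
    ]
    total = sum(unnorm)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [x / total for x in unnorm]


def point_backup(b, alphas, T, Z, R, gamma):
    """One point-based value backup at the single belief point b.

    alphas is the current set of alpha vectors; R[a][s] is the immediate
    reward. Returns the new alpha vector that is best at b and the
    action that produced it.
    """
    n = len(b)
    n_obs = len(Z[0][0])
    best_vec, best_val, best_a = None, float("-inf"), None
    for a in range(len(T)):
        g_a = list(R[a])
        for o in range(n_obs):
            # Project every alpha vector back through action a and observation o.
            candidates = [
                [
                    sum(T[a][s][s2] * Z[a][s2][o] * alpha[s2] for s2 in range(n))
                    for s in range(n)
                ]
                for alpha in alphas
            ]
            # Keep only the projection that is best at this belief point.
            g_ao = max(candidates, key=lambda g: dot(b, g))
            g_a = [x + gamma * y for x, y in zip(g_a, g_ao)]
        val = dot(b, g_a)
        if val > best_val:
            best_vec, best_val, best_a = g_a, val, a
    return best_vec, best_a
```

Because the maximization is done only at the given belief b, each backup yields a single alpha vector, which is what keeps point-based methods cheaper than exact value iteration over the whole simplex.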

Wu, B., Wu, M., & She, J. H. (2013). Point-based online value iteration algorithm for POMDPs. Ruan Jian Xue Bao/Journal of Software, 24(1), 25–36. https://doi.org/10.3724/SP.J.1001.2013.04258
