Improving Anytime Point-Based Value Iteration Using Principled Point Selections
In Proceedings of the Twentieth International Joint Conference on Artificial Intelligence (IJCAI-07). Pages 865--871. January 2007.
Copyright © 2007 IJCAI. Online proceedings are available at http://www.ijcai.org/papers07/contents.php.
Planning in partially observable dynamical systems (such as POMDPs and PSRs) is a computationally challenging task. Point-based planning methods are popular approximation techniques that have proved successful; they include point-based value iteration (PBVI), which approximates the solution at a finite set of points. These point-based methods are typically anytime algorithms: an initial solution is obtained using a small set of points, and the solution may be improved incrementally by including additional points. We introduce a family of anytime PBVI algorithms that use the information present in the current solution to identify and add new points that have the potential to best improve the next solution. We motivate and present two different methods for choosing points and evaluate their performance empirically, demonstrating that high-quality solutions can be obtained with significantly fewer points than previous PBVI approaches.
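To make "approximating the solution at a finite set of points" concrete, the following is a minimal sketch of a generic point-based backup for a POMDP, written in plain Python. It is not the point-selection method of this paper, which the abstract does not specify. The two-state tiger-style model, the tables T, Z, and R, the discount GAMMA, and the function names are all assumptions made for illustration.

```python
# Hypothetical two-state, tiger-style POMDP (illustrative numbers only).
# Actions: 0 = listen, 1 = open left, 2 = open right.
GAMMA = 0.95
# T[a][s][s2]: probability of moving from state s to s2 under action a.
T = [[[1.0, 0.0], [0.0, 1.0]],
     [[0.5, 0.5], [0.5, 0.5]],
     [[0.5, 0.5], [0.5, 0.5]]]
# Z[a][s2][o]: probability of observation o after action a lands in s2.
Z = [[[0.85, 0.15], [0.15, 0.85]],
     [[0.5, 0.5], [0.5, 0.5]],
     [[0.5, 0.5], [0.5, 0.5]]]
# R[a][s]: immediate reward for taking action a in state s.
R = [[-1.0, -1.0], [-100.0, 10.0], [10.0, -100.0]]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def backup(b, alphas):
    """Return the alpha vector produced by a one-step backup at belief b."""
    n_states = len(b)
    best_vec, best_val = None, float("-inf")
    for a in range(len(R)):
        vec = list(R[a])
        for o in range(len(Z[a][0])):
            # Project every current alpha vector through (a, o) and keep
            # the projection that is best at this particular belief.
            projections = [
                [sum(T[a][s][s2] * Z[a][s2][o] * alpha[s2]
                     for s2 in range(n_states))
                 for s in range(n_states)]
                for alpha in alphas
            ]
            g = max(projections, key=lambda p: dot(b, p))
            vec = [v + GAMMA * gs for v, gs in zip(vec, g)]
        val = dot(b, vec)
        if val > best_val:
            best_vec, best_val = vec, val
    return best_vec


def pbvi(beliefs, n_iters):
    """Run n_iters rounds of point-based backups over a fixed belief set."""
    r_min = min(min(row) for row in R)
    # Start from a single pessimistic vector, a lower bound on the value.
    alphas = [[r_min / (1.0 - GAMMA)] * len(beliefs[0])]
    for _ in range(n_iters):
        alphas = [backup(b, alphas) for b in beliefs]
    return alphas


def value(b, alphas):
    return max(dot(b, alpha) for alpha in alphas)


if __name__ == "__main__":
    # Anytime flavor: solve with one point, then again with more points.
    small = pbvi([[0.5, 0.5]], 30)
    large = pbvi([[0.5, 0.5], [0.9, 0.1], [0.1, 0.9]], 30)
    print(value([0.9, 0.1], small), value([0.9, 0.1], large))
```

The sketch keeps one alpha vector per belief point, so the cost of each iteration grows with the size of the point set. That is why the choice of which points to add matters for an anytime method.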
paperID = "IJCAI-07",
month = "January",
author = "Michael R. James and Michael E. Samples and Dmitri A. Dolgov",
booktitle = "Proceedings of the Twentieth International Joint Conference on Artificial Intelligence (IJCAI-07)",
address = "Hyderabad, India",
title = "Improving Anytime Point-Based Value Iteration Using Principled Point Selections",
pages = "865--871",
year = "2007"