Maxnorm Projections for Factored MDPs (2001)by C. Guestrin, D. Koller, and R. Parr
Abstract:
Markov Decision Processes (MDPs) provide a coherent mathematical framework for planning under uncertainty. However, exact MDP solution algorithms require the manipulation of a value function, which specifies a value for each state in the system. Most realworld MDPs are too large for such a representation to be feasible, preventing the use of exact MDP algorithms. Various approximate solution algorithms have been proposed, many of which use a linear combination of basis functions as a compact approximation to the value function. Almost all of these algorithms use an approximation based on the (weighted) L2norm (Euclidean distance); this approach prevents the application of standard convergence results for MDP algorithms, all of which are based on maxnorm. This paper makes two contributions. First, it presents the first approximate MDP solution algorithms  both value and policy iteration  that use maxnorm projection, thereby directly optimizing the quantity required to obtain the best error bounds. Second, it shows how these algorithms can be applied efficiently in the context of factored MDPs, where the transition model is specified using a dynamic Bayesian network.
Download Information
C. Guestrin, D. Koller, and R. Parr (2001). "Maxnorm Projections for Factored MDPs." Seventeenth International Joint Conference on Artificial Intelligence (IJCAI) (pp. 673680).


Bibtex citation
@inproceedings{Guestrin+al:IJCAI01,
title = {Maxnorm Projections for Factored {MDPs}},
author = {C. Guestrin and D. Koller and R. Parr},
booktitle = {Seventeenth International Joint Conference on Artificial Intelligence (IJCAI)},
address = {Seattle, Washington},
month = {August},
year = 2001,
pages = {673680},
}
full list
