Approximate Probabilistic Constraints and Risk-Sensitive Optimization Criteria in Markov Decision Processes

Dmitri A. Dolgov, and Edmund H. Durfee

In Proceedings of the Eighth International Symposiums on Artificial Intelligence and Mathematics (ISAIM 7-2004). January 2004.

Abstract
The majority of the work in the area of Markov decision processes has focused on expected values of rewards in the objective function and expected costs in the constraints. Although several methods have been proposed to model risk-sensitive utility functions and constraints, they are only applicable to certain classes of utility functions and allow limited expressiveness in the constraints. We propose a construction that extends the standard linear programming formulation of MDPs by augmenting it with additional optimization variables, which allows us to compute the higher order moments of the total costs (and/or reward). This greatly increases the expressive power of the model, and supports reasoning about the probability distributions of the total costs (reward). Consequently, this allows us to formulate more interesting constraints and to model a wide range of utility functions. In particular, in this work we show how to formulate the constraint that bounds the probability of the total incurred costs falling within a given range. Constraints of that type arise, for example, when one needs to bound the probability of overutilizing a consumable resource. Our construction, which greatly increases the expressive power of our model, unfortunately comes at the cost of significantly increasing the size and the complexity of the optimization program. On the other hand, it allows one to choose how many higher order moments of the costs (and/or reward) are modeled as a means of balancing accuracy against computational effort.


BibTex
@inproceedings{ dolgov04moments,
   paperID   = "ISAIM-04",
   month     = "January",
   author    = "Dmitri A. Dolgov and Edmund H. Durfee",
   booktitle = "Proceedings of the Eighth International Symposiums on Artificial Intelligence and Mathematics (ISAIM 7-2004)",
   address   = "Florida",
   title     = "Approximate Probabilistic Constraints and Risk-Sensitive Optimization Criteria in {Markov} Decision Processes",
   year      = "2004"
}


Download:
pdf [pdf]        ps [ps.gz]