Satisficing Strategies for Resource-Limited Policy Search in Dynamic Environments

Dmitri A. Dolgov, and Edmund H. Durfee

In Proceedings of the First International Joint Conference on Autonomous Agents and Multiagent Systems. Pages 1325--1332. 2002.

Copyright © 2002 ACM. Publisher's version is available at http://doi.acm.org/10.1145/545056.545126.

Abstract
In this work, we examine the problem of searching for schedulable real-time control policies for resource-limited agents acting in dynamic environments. The dynamic properties of the environment and resource limitations of the agent render the problem of solving for an optimal policy infeasible. We therefore limit our search to a satisficing, rather than an optimal policy. We view the policy search as a search for a state reachability graph, with an action assigned to each of the states in the graph. In the search algorithm, we exploit properties of the reachability graphs to propagate failure conditions from inherent failure states to other states in the reachability graph, which allows us to exploit constraint satisfaction techniques to quickly remove some unacceptable policies from consideration. Our analysis and experiments show that, under certain conditions, such as when the 'safe' states in the reachability graph are separated from the failure states by a relatively small set of states, we can use backtracking and memoization techniques that significantly improve the efficiency of the search algorithm.

BibTex
@inproceedings{ dolgov02satisficing,
   paperID   = "AAMAS-02",
   author    = "Dmitri A. Dolgov and Edmund H. Durfee",
   series    = "3",
   booktitle = "Proceedings of the First International Joint Conference on Autonomous Agents and Multiagent Systems",
   address   = "Bologna, Italy",
   title     = "Satisficing Strategies for Resource-Limited Policy Search in Dynamic Environments",
   year      = "2002",
   pages     = "1325--1332"
}