Daphne Koller Publications

Policy Search via Density Estimation (2000)

by A.Y. Ng, R. Parr, and D. Koller

Abstract: We propose a new approach to the problem of searching a space of stochastic controllers for a Markov decision process (MDP) or a partially observable Markov decision process (POMDP). Following several other authors, our approach is based on searching in parameterized families of policies (for example, via gradient descent) to optimize solution quality. However, rather than trying to estimate the values and derivatives of a policy directly, we do so indirectly using estimates for the probability densities that the policy induces on states at the different points in time. This enables our algorithms to exploit the many techniques for efficient and robust approximate density propagation in stochastic systems. We show how our techniques can be applied both to deterministic propagation schemes (where the MDP's dynamics are given explicitly in compact form) and to stochastic propagation schemes (where we have access only to a generative model, or simulator, of the MDP). We present empirical results for both of these variants on complex problems.
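As a rough illustration of the idea in the abstract, the sketch below computes a policy's value from the state densities it induces at each time step, then improves the policy parameters by gradient ascent. It is a minimal sketch under stated assumptions, not the paper's algorithm: the two-state MDP, the logistic policy, the names `propagate`, `value`, `gradient`, and `search`, and all constants are hypothetical. The density propagation here is exact over a tiny tabular model, whereas the paper targets approximate propagation in larger systems, and the gradient is taken by finite differences purely for simplicity.

```python
import math

# Hypothetical 2-state, 2-action MDP (illustrative only, not from the paper).
# P[a][s][s2] = probability of moving from state s to s2 under action a.
P = [
    [[0.9, 0.1], [0.8, 0.2]],   # action 0: tends toward state 0
    [[0.2, 0.8], [0.1, 0.9]],   # action 1: tends toward state 1
]
R = [0.0, 1.0]        # reward depends on the state only
PHI0 = [1.0, 0.0]     # initial state density
GAMMA, HORIZON = 0.95, 30


def policy(theta, s):
    """Stochastic policy: action probabilities in state s (logistic in theta[s])."""
    p1 = 1.0 / (1.0 + math.exp(-theta[s]))
    return [1.0 - p1, p1]


def propagate(phi, theta):
    """One step of exact density propagation under the policy."""
    nxt = [0.0] * len(phi)
    for s, mass in enumerate(phi):
        for a, pa in enumerate(policy(theta, s)):
            for s2, p in enumerate(P[a][s]):
                nxt[s2] += mass * pa * p
    return nxt


def value(theta):
    """Discounted value computed from the densities at times 0..HORIZON."""
    phi, total = PHI0, 0.0
    for t in range(HORIZON + 1):
        total += GAMMA ** t * sum(m * r for m, r in zip(phi, R))
        phi = propagate(phi, theta)
    return total


def gradient(theta, eps=1e-5):
    """Central finite-difference estimate of the value's gradient in theta."""
    grad = []
    for i in range(len(theta)):
        up = list(theta)
        up[i] += eps
        dn = list(theta)
        dn[i] -= eps
        grad.append((value(up) - value(dn)) / (2 * eps))
    return grad


def search(theta, steps=200, lr=0.5):
    """Plain gradient ascent on the density-based value estimate."""
    for _ in range(steps):
        theta = [t + lr * g for t, g in zip(theta, gradient(theta))]
    return theta
```

Calling `search([0.0, 0.0])` returns parameters whose `value` is higher than that of the starting policy. With only a simulator available, `propagate` could instead be replaced by a sample-based density estimate, which corresponds to the stochastic propagation variant the abstract mentions.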

Download Information

A.Y. Ng, R. Parr, and D. Koller (2000). "Policy Search via Density Estimation." Advances in Neural Information Processing Systems (NIPS '99).

Bibtex citation

  author =       "A.Y. Ng and R. Parr and D. Koller",
  booktitle =    "Advances in Neural Information Processing Systems (NIPS '99)",
  title =        "Policy Search via Density Estimation",
  year =         2000,
