My current research deals with approximate solutions for tightly coupled partially observable stochastic games with common payoffs.

Partially observable Markov decision processes (POMDPs) are a rich framework for modelling uncertainty in robot control problems. Unlike Markov decision processes (MDPs), however, they cannot be directly applied to multi-robot problems, even when a robot team has a common reward structure, unless that team also has unlimited, instantaneous communication. Instead, game theory must be used to model how each member of the team makes individual decisions about which actions to take, given its own observation history and a model of its teammates. I model control problems in this domain as partially observable stochastic games (POSGs).
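For reference, one standard way to write down a POSG with common payoffs is the tuple below; the notation follows the usual POSG/Dec-POMDP literature rather than anything specific to my work.

```latex
% A POSG with common payoffs, in standard notation (not specific to this work):
\[
  \langle I,\; S,\; b^{0},\; \{A_i\},\; \{\Omega_i\},\; T,\; O,\; R \rangle
\]
% where $I$ is the set of $n$ robots, $S$ the set of world states, $b^{0}$ the
% initial state distribution, $A_i$ and $\Omega_i$ robot $i$'s action and
% observation sets, $T(s' \mid s, \vec{a})$ the transition function over joint
% actions $\vec{a}$, $O(\vec{o} \mid s', \vec{a})$ the joint observation
% function, and $R(s, \vec{a})$ the single reward shared by the whole team.
```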

The following figure perhaps best explains why each robot cannot model the problem as a POMDP. In the single-robot case, a POMDP transforms uncertainty over a set of states into a single point in belief space, and MDP-style planning can then be performed in that new space. In a POSG, however, uncertainty remains over a set of points in belief space: while each robot knows its own history of observations and actions, it does not know its teammates' histories with certainty, and it therefore has to maintain a probability distribution over the possible joint state of the team.
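To make that contrast concrete, here is a minimal sketch. The state, action, and observation sets and the probability arrays T and Z are made up for illustration, and the multi-robot part is only a cartoon of "a weighted set of belief points", not my actual filter.

```python
import numpy as np

# Toy single-robot model with made-up sizes; T[a, s, s'] and Z[a, s', o] are
# placeholder transition and observation probabilities, not a real problem.
n_states, n_actions, n_obs = 3, 2, 2
rng = np.random.default_rng(0)
T = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))  # T[a, s, :] sums to 1
Z = rng.dirichlet(np.ones(n_obs), size=(n_actions, n_states))     # Z[a, s', :] sums to 1

def belief_update(b, a, o):
    """Standard Bayes-filter step: in a single-robot POMDP, one
    action/observation pair maps the belief to exactly one new point
    in belief space."""
    b_pred = T[a].T @ b           # predict: sum_s T(s' | s, a) b(s)
    b_new = Z[a][:, o] * b_pred   # correct: weight by P(o | s', a)
    return b_new / b_new.sum()

# Single-robot case: the robot's own history collapses to ONE belief point.
b = np.full(n_states, 1.0 / n_states)
b = belief_update(b, a=0, o=1)

# Multi-robot case: a robot knows its own history but not its teammate's, so
# it carries a weighted SET of belief points, one per joint history it still
# considers possible (the hypothesised teammate histories and weights below
# are illustrative only).
possible_teammate_steps = [((0, 0), 0.6), ((0, 1), 0.4)]  # (action, obs), prob
belief_set = [(w, belief_update(b, ta, to)) for (ta, to), w in possible_teammate_steps]
print(belief_set)
```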

Solving a POSG exactly is intractable for all but extremely small problems with short finite horizons. I am interested in approximating the optimal solution to a POSG with an approach that interleaves planning and execution. In my algorithm, the robots find a one-step policy at each timestep and use the policies from earlier timesteps, together with common observations, to prune the set of joint observation and action histories that must be tracked.
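The loop below is a rough sketch of that interleaving. All of the callables passed in (solve_one_step_game, execute_and_observe, expand, consistent) are hypothetical stand-ins for the real components; the point is only to show the control flow of planning, acting, expanding, and pruning, not the details of the actual algorithm.

```python
def run_interleaved_control(solve_one_step_game, execute_and_observe,
                            expand, consistent, histories, my_id, horizon):
    """Sketch of the interleaved planning/execution loop; every callable is a
    hypothetical stand-in for a real component of the planner."""
    for t in range(horizon):
        # 1. Every robot solves the same one-step game over the same set of
        #    candidate joint histories, so they agree on the joint policy.
        policy = solve_one_step_game(histories)
        # 2. Act on my share of the joint policy and record what I observe.
        my_action = policy[my_id]
        my_obs = execute_and_observe(my_action)
        # 3. Grow each candidate joint history by the prescribed joint action
        #    and the observations teammates could have received ...
        expanded = expand(histories, policy)
        # 4. ... then prune away histories inconsistent with my own action and
        #    observation (and with any observations common to the team); this
        #    is what keeps the set of tracked joint histories from exploding.
        histories = [(p, h) for p, h in expanded
                     if consistent(h, my_id, my_action, my_obs)]
        if not histories:
            raise RuntimeError("all candidate joint histories were pruned")
        total = sum(p for p, _ in histories)
        histories = [(p / total, h) for p, h in histories]
    return histories
```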

This approach has been applied successfully to robot tag problems, both in simulation and on real robots. For example, this animated gif shows the paths taken by a robot team searching for an opponent in the A-wing of Gates Hall.