Parameter Learning from Incomplete Data: Summary
Non-linear optimization problem
Methods for learning: EM and Gradient Ascent
- Exploit inference for learning
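A minimal sketch of how EM exploits inference, assuming a toy two-node network H -> X that is not taken from the slides: H is a hidden binary variable (which of two coins was used) and X is the number of heads in n flips. The E-step is an inference query, P(H | X) under the current parameters; the M-step re-estimates the parameters from the resulting expected counts. All names (`binom_pmf`, `log_likelihood`, `em`) and the data are hypothetical.

```python
import math


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def log_likelihood(data, n, pi, theta):
    """Log-likelihood of the observed counts, with the hidden H summed out."""
    return sum(
        math.log((1 - pi) * binom_pmf(k, n, theta[0]) + pi * binom_pmf(k, n, theta[1]))
        for k in data
    )


def em(data, n, pi, theta, iters=50):
    """Run EM; pi = P(H=1), theta[h] = P(heads | H=h)."""
    history = []
    for _ in range(iters):
        # E-step: inference. Posterior P(H=1 | X=k) for every observation.
        resp = []
        for k in data:
            w0 = (1 - pi) * binom_pmf(k, n, theta[0])
            w1 = pi * binom_pmf(k, n, theta[1])
            resp.append(w1 / (w0 + w1))
        # M-step: maximum-likelihood estimates from expected counts.
        n1 = sum(resp)
        n0 = len(data) - n1
        pi = n1 / len(data)
        theta = (
            sum((1 - r) * k for r, k in zip(resp, data)) / (n0 * n),
            sum(r * k for r, k in zip(resp, data)) / (n1 * n),
        )
        history.append(log_likelihood(data, n, pi, theta))
    return pi, theta, history


# Hypothetical data: heads observed in five rounds of 10 flips each.
data = [5, 9, 8, 4, 7]
pi, theta, history = em(data, 10, pi=0.5, theta=(0.6, 0.5))
print(pi, theta, history[-1])
```

The log-likelihood recorded in `history` should never decrease from one iteration to the next, which is the usual sanity check for an EM implementation.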
Exploration of a complex likelihood/posterior
- More missing data ⇒ many more local maxima
- Cannot represent posterior ⇒ must resort to approximations
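The sensitivity to the shape of the likelihood surface can be illustrated with a small sketch, again assuming a hypothetical two-coin mixture (hidden coin identity, observed head counts) rather than anything from the slides. EM started from a symmetric point never leaves it: that point is a stationary point of the likelihood (in this toy case a saddle rather than a true local maximum), while an asymmetric start reaches a strictly higher likelihood. The function `run_em` and the data are assumptions made for illustration.

```python
import math


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def run_em(data, n, pi, theta, iters=200):
    """EM for a two-component binomial mixture; returns (pi, theta, log-lik)."""
    for _ in range(iters):
        # E-step: posterior probability that each observation came from coin 1.
        resp = []
        for k in data:
            w0 = (1 - pi) * binom_pmf(k, n, theta[0])
            w1 = pi * binom_pmf(k, n, theta[1])
            resp.append(w1 / (w0 + w1))
        # M-step: re-estimate parameters from expected counts.
        n1 = sum(resp)
        n0 = len(data) - n1
        pi = n1 / len(data)
        theta = (
            sum((1 - r) * k for r, k in zip(resp, data)) / (n0 * n),
            sum(r * k for r, k in zip(resp, data)) / (n1 * n),
        )
    ll = sum(
        math.log((1 - pi) * binom_pmf(k, n, theta[0]) + pi * binom_pmf(k, n, theta[1]))
        for k in data
    )
    return pi, theta, ll


data = [5, 9, 8, 4, 7]  # hypothetical head counts out of 10 flips

# Symmetric start: both coins identical, so the E-step cannot tell them apart.
pi_s, theta_s, ll_s = run_em(data, 10, pi=0.5, theta=(0.5, 0.5))
# Asymmetric start: the two coins can specialize.
pi_a, theta_a, ll_a = run_em(data, 10, pi=0.5, theta=(0.45, 0.8))

print("symmetric start: ", theta_s, ll_s)
print("asymmetric start:", theta_a, ll_a)
```

In practice this is why EM and gradient ascent are commonly run from several different starting points, keeping the run with the highest likelihood.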
Inference
- Main computational bottleneck for learning
- Learning large networks ⇒ exact inference is infeasible ⇒ resort to stochastic simulation or approximate inference (e.g., see Jordan’s tutorial)