Gradient Ascent
Requires computation: P(x_i, Pa_i | o[m], θ) for all i, m
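The gradient itself shows why these posteriors are needed. As a sketch, assuming table CPDs and writing \ell for the log-likelihood of the data D and u_i for an assignment to the parents Pa_i (notation introduced here for illustration), the standard form of the gradient is:

```latex
\frac{\partial \ell(\theta : D)}{\partial P(x_i \mid u_i)}
  = \frac{1}{P(x_i \mid u_i)} \sum_{m} P(x_i, u_i \mid o[m], \theta)
```

Each term in the sum is one of the posteriors listed above, so every gradient evaluation requires running inference on every data instance.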
Pros:
- Flexible
- Closely related to methods in neural network training
Cons:
- Need to project gradient onto space of legal parameters
- Reasonable convergence requires combining it with “smart” optimization techniques (e.g. line search, conjugate gradient)
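The points above can be illustrated with a minimal sketch, not a definitive implementation. It assumes a hypothetical two-node network A -> B with binary variables, where A is sometimes unobserved (None). The names DATA, posterior_a, project_step, the initial parameters, and the fixed step size are all assumptions made for this example. The projection used here (subtract the mean of the gradient, then clip and renormalize) is one simple way to stay inside the space of legal parameters; a fixed step size stands in for the smarter optimization techniques mentioned above.

```python
import numpy as np

# Hypothetical data set: (a, b) pairs, a is None when A is unobserved.
DATA = [(0, 0), (0, 0), (0, 1), (1, 1), (1, 1), (1, 0),
        (None, 0), (None, 1), (None, 1), (0, 0), (1, 1), (None, 0)]

def posterior_a(a, b, theta_a, theta_b_given_a):
    """P(A | o[m], theta) for one instance."""
    if a is not None:
        post = np.zeros(2)
        post[a] = 1.0
        return post
    joint = theta_a * theta_b_given_a[:, b]  # P(A = a', B = b) for each a'
    return joint / joint.sum()

def expected_counts(data, theta_a, theta_b_given_a):
    """Sum over m of P(A | o[m]) and of P(A, B | o[m])."""
    n_a = np.zeros(2)
    n_ab = np.zeros((2, 2))
    for a, b in data:
        post = posterior_a(a, b, theta_a, theta_b_given_a)
        n_a += post
        n_ab[:, b] += post  # B is always observed in this sketch
    return n_a, n_ab

def log_likelihood(data, theta_a, theta_b_given_a):
    ll = 0.0
    for a, b in data:
        joint = theta_a * theta_b_given_a[:, b]
        ll += np.log(joint[a] if a is not None else joint.sum())
    return ll

def project_step(theta, grad, lr):
    """Take a step along the gradient projected onto the sum-to-one constraint."""
    grad = grad - grad.mean(axis=-1, keepdims=True)  # remove component that changes the sum
    theta = np.clip(theta + lr * grad, 1e-6, None)   # keep probabilities positive
    return theta / theta.sum(axis=-1, keepdims=True)

def gradient_ascent(data, steps=200, lr=0.1):
    theta_a = np.array([0.5, 0.5])
    theta_b_given_a = np.array([[0.6, 0.4], [0.4, 0.6]])  # rows: A, columns: B
    m = len(data)
    for _ in range(steps):
        n_a, n_ab = expected_counts(data, theta_a, theta_b_given_a)
        # Gradient of the average log-likelihood: expected count / parameter / M
        grad_a = n_a / theta_a / m
        grad_b = n_ab / theta_b_given_a / m
        theta_a = project_step(theta_a, grad_a, lr)
        theta_b_given_a = project_step(theta_b_given_a, grad_b, lr)
    return theta_a, theta_b_given_a

if __name__ == "__main__":
    theta_a, theta_b_given_a = gradient_ascent(DATA)
    print("P(A):", theta_a)
    print("P(B | A):", theta_b_given_a)
    print("log-likelihood:", log_likelihood(DATA, theta_a, theta_b_given_a))
```

Note that every iteration calls expected_counts, which runs inference on each instance; this is the computational requirement listed at the top of the slide.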