What is the problem?
Objective function
- Learning of arbitrary Bayesian networks optimizes the joint likelihood P(C, F1,...,Fn)
- It may learn a network that models the feature marginals P(Fi,...,Fj) well but does a poor job on P(C | F1,...,Fn) (given enough data this is no problem, since the true joint would be recovered — but data is rarely enough)
- We want to optimize classification accuracy or at least the conditional likelihood P(C | F1,...,Fn)
- Scores based on this likelihood do not decompose → learning is computationally expensive!
- There is controversy as to the correct form for these scores
Naive Bayes, TAN, etc. circumvent the problem by forcing a structure in which all features are connected to the class
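To make the gap between the two objectives concrete, here is a minimal sketch (toy data and helper names are hypothetical) that fits a Naive Bayes model and evaluates both the generative score, the sum of log P(C, F1,...,Fn), and the discriminative score, the sum of log P(C | F1,...,Fn), on the same data:

```python
import math

# Hypothetical toy dataset: each row is (class c, binary features f1, f2).
data = [(0, 0, 0), (0, 0, 1), (0, 1, 0),
        (1, 1, 1), (1, 1, 0), (1, 0, 1)]

def nb_params(rows, alpha=1.0):
    """Estimate Naive Bayes parameters with Laplace smoothing."""
    classes = {0, 1}
    prior = {c: (sum(r[0] == c for r in rows) + alpha) / (len(rows) + 2 * alpha)
             for c in classes}
    # cond[c][i] = P(F_i = 1 | C = c)
    cond = {c: [(sum(r[i + 1] == 1 for r in rows if r[0] == c) + alpha)
                / (sum(r[0] == c for r in rows) + 2 * alpha)
                for i in range(2)]
            for c in classes}
    return prior, cond

def log_joint(prior, cond, c, feats):
    """log P(C = c, F = feats) under the Naive Bayes factorisation."""
    lp = math.log(prior[c])
    for i, f in enumerate(feats):
        p = cond[c][i] if f == 1 else 1.0 - cond[c][i]
        lp += math.log(p)
    return lp

def log_conditional(prior, cond, c, feats):
    """log P(C = c | F = feats): joint minus log-marginal over all classes."""
    joints = {k: log_joint(prior, cond, k, feats) for k in prior}
    log_marginal = math.log(sum(math.exp(v) for v in joints.values()))
    return joints[c] - log_marginal

prior, cond = nb_params(data)
# Generative objective: sum of log P(c, f) over the data.
gen = sum(log_joint(prior, cond, c, (f1, f2)) for c, f1, f2 in data)
# Discriminative objective: sum of log P(c | f) over the data.
dis = sum(log_conditional(prior, cond, c, (f1, f2)) for c, f1, f2 in data)
print("joint log-likelihood:", gen)
print("conditional log-likelihood:", dis)
```

A structure search guided by the first sum can look good while the second sum, the one classification actually cares about, stays poor; and because the conditional score divides by the marginal P(F1,...,Fn), it does not decompose into per-family terms the way the joint score does.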