Optimality of the decision rule
Minimizing the error rate...
Let $c_i$ be the true class, and let $l_j$ be the class returned by the classifier.
A decision by the classifier is correct if $c_i = l_j$, and in error if $c_i \neq l_j$.
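The error incurred by choosing label $l_j$ for an observation $x$ is the probability that the true class is any other class:
$$E(l_j \mid x) = \sum_{c_i \neq l_j} P(c_i \mid x) = 1 - P(l_j \mid x)$$
Thus, had we had access to $P$, we would minimize the error rate by choosing $l_i$ when
$$P(l_i \mid x) > P(l_j \mid x) \quad \text{for all } j \neq i,$$
which is the decision rule for the Bayesian classifier.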
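
As a minimal sketch of the decision rule, assume the posterior probabilities of each class for a single observation are already known. The function names and the posterior values below are hypothetical and chosen only for illustration. Under 0-1 loss, the expected error of choosing a label is one minus its posterior, so picking the largest posterior is the same as picking the smallest expected error.

```python
def expected_errors(posteriors):
    """Expected 0-1 loss of choosing each label: 1 - P(label | x)."""
    return {label: 1.0 - p for label, p in posteriors.items()}


def bayes_decision(posteriors):
    """Return the label with the largest posterior probability."""
    return max(posteriors, key=posteriors.get)


# Hypothetical posteriors P(c | x) for one observation x (illustrative values).
posteriors = {"a": 0.2, "b": 0.5, "c": 0.3}

errors = expected_errors(posteriors)
choice = bayes_decision(posteriors)

# The chosen label is also the one with the smallest expected error.
print(choice, errors[choice])
```

In practice the true posteriors are not available and must be estimated, so a classifier built this way only approximates the minimum error rate.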