Learning Bayesian Networks from Data

8/11/98


Authors:

Nir Friedman, UC Berkeley

Moises Goldszmidt, SRI International

Table of Contents

  1. Learning Bayesian Networks from Data
  2. Outline
  3. Learning (in this context)
  4. Why learning?
  5. Why learn a Bayesian network?
  6. What will I get out of this tutorial?
  7. Outline
  8. Probability 101
  9. Representing the Uncertainty in a Domain
  10. Probabilistic Independence: a Key for Representation and Reasoning
  11. Probabilistic Independence: a Key for Representation and Reasoning
  12. Bayesian Networks
  13. Bayesian Networks
  14. Bayesian Network Semantics
  15. Monitoring Intensive-Care Patients
  16. Qualitative part
  17. d-separation
  18. Example
  19. I-Equivalent Bayesian Networks
  20. Quantitative Part
  21. What Can We Do with Bayesian Networks?
  22. Bayesian Networks: Summary
  23. Learning Bayesian networks (reminder)
  24. The Learning Problem
  25. Learning Problem
  26. Learning Problem
  27. Learning Problem
  28. Learning Problem
  29. Outline
  30. Example: Binomial Experiment (Statistics 101)
  31. Statistical parameter fitting
  32. The Likelihood Function
  33. Sufficient Statistics
  34. Maximum Likelihood Estimation
  35. Maximum Likelihood Estimation (Cont.)
  36. Example: MLE in Binomial Data
  37. Learning Parameters for the Burglary Story
  38. General Bayesian Networks
  39. General Bayesian Networks (Cont.)
  40. From Binomial to Multinomial
  41. Likelihood for Multinomial Networks
  42. Is MLE all we need?
  43. Bayesian Inference
  44. Bayesian Inference (cont.)
  45. Bayesian Inference (cont.)
  46. Bayesian Inference (cont.)
  47. Example: Binomial Data Revisited
  48. Bayesian Inference and MLE
  49. Dirichlet Priors
  50. Dirichlet Priors (cont.)
  51. Priors Intuition
  52. Effect of Priors
  53. Effect of Priors (cont.)
  54. Conjugate Families
  55. Bayesian Networks and Bayesian Prediction
  56. Bayesian Networks and Bayesian Prediction (Cont.)
  57. Bayesian Prediction (cont.)
  58. Bayesian Prediction (cont.)
  59. Assessing Priors for Bayesian Networks
  60. Learning Parameters: Case Study (cont.)
  61. Learning Parameters: Case Study (cont.)
  62. Learning Parameters: Case Study (cont.)
  63. Learning Parameters: Summary
  64. Outline
  65. Incomplete Data
  66. Missing Values
  67. Missing Values (cont.)
  68. Missing Values (cont.)
  69. Hidden (Latent) Variables
  70. Hidden Variables (cont.)
  71. Learning Parameters from Incomplete Data
  72. Example
  73. Learning Parameters from Incomplete Data (cont.)
  74. MLE from Incomplete Data
  75. Gradient Ascent
  76. Expectation Maximization (EM)
  77. EM (cont.)
  78. EM (cont.)
  79. Example: EM in clustering
  80. EM in Practice
  81. Error on training set (Alarm)
  82. Test set error (Alarm)
  83. Parameter value (Alarm)
  84. Parameter value (Alarm)
  85. Parameter value (Alarm)
  86. Bayesian Inference with Incomplete Data
  87. MAP Approximation
  88. Stochastic Approximations
  89. Stochastic Approximations (cont.)
  90. Stochastic Approximations: Gibbs Sampling
  91. Parameter Learning from Incomplete Data: Summary
  92. Outline
  93. Benefits of Learning Structure
  94. Why Struggle for Accurate Structure
  95. Approaches to Learning Structure
  96. Constraints versus Scores
  97. Likelihood Score for Structures
  98. Likelihood Score for Structure (cont.)
  99. Likelihood Score for Structure (cont.)
  100. Avoiding Overfitting
  101. Avoiding Overfitting (cont.)
  102. Minimum Description Length
  103. Minimum Description Length (cont.)
  104. Minimum Description: Complexity Penalization
  105. Minimum Description: Example
  106. Minimum Description: Example (cont.)
  107. Consistency of the MDL Score
  108. Bayesian Inference
  109. Marginal Likelihood: Binomial case
  110. Marginal Likelihood: Binomials (cont.)
  111. Binomial Likelihood: Example
  112. Marginal Likelihood: Example (cont.)
  113. Marginal Likelihood: Multinomials
  114. Marginal Likelihood: Bayesian Networks
  115. Marginal Likelihood (cont.)
  116. Priors and BDe score
  117. Bayesian Score: Asymptotic Behavior
  118. Bayesian Score: Asymptotic Behavior
  119. Scores -- Summary
  120. Outline
  121. Optimization Problem
  122. Learning Trees
  123. Learning Trees (cont.)
  124. Learning Trees (cont.)
  125. Learning Trees: Example
  126. Beyond Trees
  127. Heuristic Search
  128. Heuristic Search (cont.)
  129. Exploiting Decomposability in Local Search
  130. Greedy Hill-Climbing
  131. Greedy Hill-Climbing (cont.)
  132. Greedy Hill-Climbing (cont.)
  133. Greedy Hill-Climbing
  134. Other Local Search Heuristics
  135. I-Equivalence Class Search
  136. I-Equivalence Class Search (cont.)
  137. Search and Statistics
  138. Learning in Practice: Time & Statistics
  139. Learning in Practice: Alarm domain
  140. Model Averaging
  141. Model Averaging (cont.)
  142. Model Averaging (cont.)
  143. Search: Summary
  144. Outline
  145. Local and Global Structure
  146. Local structure: Decision trees
  147. Learning decision trees
  148. Effects on learning
  149. Local Structure => More Accurate Global Structure
  150. Local structure: Noisy-Or
  151. Local structure: Noisy-Or decomposition
  152. Other Types of Local Structure
  153. Outline
  154. The Classification Problem
  155. Examples
  156. Approaches
  157. Generative Models
  158. Optimality of the decision rule: Minimizing the error rate...
  159. Advantages of the Generative Model Approach
  160. Advantages of Using a Bayesian Network
  161. The Naive Bayesian Classifier
  162. The Naive Bayesian Classifier (cont.)
  163. Improving Naive Bayes
  164. Tree Augmented Naive Bayes (TAN)
  165. Evaluating the performance of a classifier: n-fold cross validation
  166. Performance: TAN vs. Naive Bayes
  167. Performance: TAN vs C4.5
  168. Beyond TAN
  169. Performance: TAN vs. Bayesian Networks
  170. What is the problem?
  171. Classification: Summary
  172. Outline
  173. Learning Causal Relations (Thanks to David Heckerman and Peter Spirtes for the slides)
  174. Causal Discovery by Experiment
  175. What is 'Cause' Anyway?
  176. Probabilistic vs. Causal Models
  177. To Predict the Effects of Actions: Modify the Causal Graph
  178. Causal Model
  179. Ideal Interventions
  180. How Can We Learn Cause and Effect from Observational Data?
  181. Learning Cause from Observations: Constraint-Based Approach
  182. Causal Markov assumption
  183. Faithfulness
  184. Other assumptions
  185. All models under consideration are causal
  186. Learning Cause from Observations: Constraint-based method
  187. Learning Cause from Observations: Constraint-based method
  188. Learning Cause from Observations: Constraint-based method
  189. Learning Cause from Observations: Constraint-based method
  190. Learning Cause from Observations: Constraint-Based Method
  191. Cannot Always Learn Cause
  192. But with four (or more) variables . . .
  193. Constraint-Based Approach
  194. The Bayesian Approach
  195. The Bayesian approach
  196. Assumptions
  197. Definition of Model Hypothesis G
  198. Faithfulness
  199. Causes of publishing productivity (Rodgers and Maranto 1989)
  200. Causes of publishing productivity
  201. Results of Greedy Search...
  202. Other Models
  203. Bayesian Model Averaging
  204. Challenges for the Bayesian Approach
  205. Benefits of the Two Approaches
  206. Summary
  207. Outline
  208. Learning Structure for Incomplete Data
  209. Incomplete Data: Structure Scores
  210. Incomplete Data: Structure Scores (cont.)
  211. PPT Slide
  212. PPT Slide
  213. Problem
  214. PPT Slide
  215. Structural EM
  216. Structural EM
  217. Expected scores
  218. How do we choose Q(H)?
  219. Structural EM for MDL
  220. PPT Slide
  221. Structural EM in Practice
  222. The Structural EM Procedure
  223. Structural EM: Convergence Properties
  224. Learning Structure from Incomplete Data: Summary
  225. Outline
  226. Summary: Learning Bayesian Networks
  227. Untouched issues
  228. Untouched Issues (Cont.)
  229. Some Applications
  230. Systems
  231. Systems (Cont.)
  232. Current Topics
  233. Perspective: What's Old and What's New
  234. The Future...
  235. Many thanks to...