Probabilistic networks (also known as Bayesian belief networks)
allow a compact description of complex stochastic relationships among
several random variables. They are rapidly becoming the tool of choice for
uncertain reasoning in artificial intelligence. In this paper, we
investigate the problem of learning probabilistic networks with known
structure and hidden variables. This is an important problem, because
structure is much easier to elicit from experts than numbers, and the world
is rarely fully observable. We present a gradient-descent-based algorithm,
and show that the gradient can be computed locally, using information that
is available as a byproduct of standard probabilistic network inference
algorithms. Our results demonstrate that using prior knowledge about the
structure, even with hidden variables, can significantly improve the
learning rate of probabilistic networks. We extend the method to networks
where the conditional probability tables are described using a small number
of parameters. Examples include noisy-OR nodes and dynamic probabilistic
networks. We show how this additional structure can be exploited by our
algorithm to speed up the learning even further. We also suggest how
similar ideas can be used to learn *hybrid* networks, where some of
the nodes take on values in a continuous domain.
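As an illustration of the kind of local gradient computation described above, here is a minimal sketch for a toy two-node network with one hidden binary variable H and one observed binary child X. The network, parameter names, learning rate, and data are all illustrative choices of ours, not taken from the paper; inference is done by brute-force enumeration, standing in for a standard probabilistic network inference algorithm whose posteriors supply the gradient terms.

```python
import math

# Toy network: hidden H -> observed X (both binary).
# Illustrative CPT parameters (not from the paper):
#   p  = P(H=1)
#   q1 = P(X=1 | H=1),  q0 = P(X=1 | H=0)

def joint(h, x, p, q1, q0):
    """P(H=h, X=x) under the current parameters."""
    ph = p if h == 1 else 1 - p
    qx = q1 if h == 1 else q0
    return ph * (qx if x == 1 else 1 - qx)

def log_likelihood(data, p, q1, q0):
    """Log-likelihood of the observed X values, with H hidden."""
    return sum(math.log(joint(0, x, p, q1, q0) + joint(1, x, p, q1, q0))
               for x in data)

def gradients(data, p, q1, q0):
    """Gradient of the log-likelihood w.r.t. each CPT parameter.

    Each case contributes posterior-probability terms P(node, parent | case)
    divided by the corresponding parameter -- quantities that inference
    produces as a byproduct, so the gradient is computed locally.
    """
    gp = gq1 = gq0 = 0.0
    for x in data:
        px = joint(0, x, p, q1, q0) + joint(1, x, p, q1, q0)
        post1 = joint(1, x, p, q1, q0) / px   # P(H=1 | x)
        post0 = 1 - post1                     # P(H=0 | x)
        gp += post1 / p - post0 / (1 - p)
        if x == 1:
            gq1 += post1 / q1
            gq0 += post0 / q0
        else:
            gq1 += -post1 / (1 - q1)
            gq0 += -post0 / (1 - q0)
    return gp, gq1, gq0

# Gradient ascent on a small made-up data set of X observations.
data = [1, 1, 0, 1, 0, 1, 1, 0]
p, q1, q0 = 0.5, 0.6, 0.4
lr = 0.01
before = log_likelihood(data, p, q1, q0)
for _ in range(200):
    gp, gq1, gq0 = gradients(data, p, q1, q0)
    # Clip to keep each parameter a valid probability.
    p  = min(max(p  + lr * gp,  1e-6), 1 - 1e-6)
    q1 = min(max(q1 + lr * gq1, 1e-6), 1 - 1e-6)
    q0 = min(max(q0 + lr * gq0, 1e-6), 1 - 1e-6)
after = log_likelihood(data, p, q1, q0)
```

Each ascent step improves the likelihood of the observed data; the key point mirrored from the abstract is that `gradients` uses only posterior quantities already computed during inference, so no extra global computation is needed.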