# 8.4 Probabilistic Inference

The most common probabilistic inference task is to compute the posterior distribution of a query variable or variables given some evidence. Unfortunately, even the problem of estimating the posterior probability in a belief network within an absolute error (of less than $0.5$), or within a constant multiplicative factor, is NP-hard, so general efficient implementations will not be available. Computing the prior probability or the posterior probability of a variable is in a complexity class called $\#NP$ (pronounced “sharp-NP”).

The main approaches for probabilistic inference in belief networks are:

Exact inference

where the probabilities are computed exactly. A simple way is to enumerate the worlds that are consistent with the evidence. It is possible to do much better than this by exploiting the structure of the network. The variable elimination algorithm is an exact algorithm that uses dynamic programming and exploits conditional independence.

Approximate inference

where probabilities are only approximated. These methods are characterized by the different guarantees they provide:

• They produce guaranteed bounds on the probabilities. That is, they return a range $[l,u]$ where the exact probability $p$ is guaranteed to have $l\leq p\leq u$. An anytime algorithm may guarantee that $l$ and $u$ get closer to each other as time (and perhaps space) increases.

• They produce probabilistic bounds on the error produced. Such algorithms might guarantee that the error, for example, is within 0.1 of the correct answer 95% of the time. They might also have guarantees that, as time increases, probability estimates will converge to the exact probability. Some even have guarantees of the rates of convergence. Stochastic simulation is a common method with such guarantees.

• They could make a best effort to produce an approximation that may be good enough, even though there may be cases where they do not work very well. One such class of techniques is called variational inference, where the idea is to find an approximation to the problem that is easy to compute. First choose a class of representations that are easy to compute. This class could be as simple as the set of disconnected belief networks (with no arcs). Next try to find a member of the class that is closest to the original problem. That is, find an easy-to-compute distribution that is as close as possible to the posterior distribution to be computed. Thus, the problem reduces to an optimization problem of minimizing the error, followed by a simple inference problem.

This book presents the variable elimination method and some stochastic simulation methods.