Intelligence 2E

foundations of computational agents

Any standard belief-network algorithms, such as variable elimination, can be used to carry out monitoring or smoothing. However, it is possible to take advantage of the fact that time moves forward and that the agent is getting observations in time and is interested in its state at the current time.

In belief monitoring or filtering, an agent computes the probability of the current state given the history of observations. In terms of the HMM of Figure 8.12, for each $i$, the agent wants to compute $P({S}_{i}\mid {o}_{0},\mathrm{\dots},{o}_{i})$, which is the distribution over the state at time $i$ given the particular observation of ${o}_{0},\mathrm{\dots},{o}_{i}$. This is done using variable elimination:

$P({S}_{i}\mid {o}_{0},\mathrm{\dots},{o}_{i})\propto $ | $P({S}_{i},{o}_{0},\mathrm{\dots},{o}_{i})$ | |||

$=$ | $P({o}_{i}\mid {S}_{i})P({S}_{i},{o}_{0},\mathrm{\dots},{o}_{i-1})$ | |||

$=$ | $P({o}_{i}\mid {S}_{i}){\displaystyle \sum _{{S}_{i-1}}}P({S}_{i},{S}_{i-1},{o}_{0},\mathrm{\dots},{o}_{i-1})$ | |||

$=$ | $P({o}_{i}\mid {S}_{i}){\displaystyle \sum _{{S}_{i-1}}}P({S}_{i}\mid {S}_{i-1})P({S}_{i-1},{o}_{0},\mathrm{\dots},{o}_{i-1})$ | |||

$\propto $ | $P({o}_{i}\mid {S}_{i}){\displaystyle \sum _{{S}_{i-1}}}P({S}_{i}\mid {S}_{i-1})P({S}_{i-1}\mid {o}_{0},\mathrm{\dots},{o}_{i-1}).$ | (8.2) |

Suppose the agent has computed the previous belief based on the observations received up until time $i-1$. That is, it has a factor representing $P({S}_{i-1}\mid {o}_{0},\mathrm{\dots},{o}_{i-1})$. This is just a factor on ${S}_{i-1}$. To compute the next belief, it multiplies this by $P({S}_{i}\mid {S}_{i-1})$, sums out ${S}_{i-1}$, multiplies this by the factor $P({o}_{i}\mid {S}_{i})$, and normalizes.

Multiplying a factor on ${S}_{i-1}$ by the factor $P({S}_{i}\mid {S}_{i-1})$ and summing out ${S}_{i-1}$ is an instance of matrix multiplication. Multiplying the result by $P({o}_{i}\mid {S}_{i})$ is called the dot product. Matrix multiplication and dot product are simple instances of variable elimination.

Consider the domain of Example 8.31. An observation of a door involves multiplying the probability of each location ${L}$ by ${P}{\mathrm{(}}{d}{o}{o}{r}{\mathrm{\mid}}{L}{o}{c}{\mathrm{=}}{L}{\mathrm{)}}$ and renormalizing. A move right involves, for each state, doing a forward simulation of the move-right action in that state weighted by the probability of being in that state.

Smoothing is the problem of computing the probability distribution of a state variable in an HMM given past and future observations. The use of future observations can make for more accurate predictions. Given a new observation, it is possible to update all previous state estimates with one sweep through the states using variable elimination; see Exercise 17.