8.6 Stochastic Simulation

8.6.6 Particle Filtering

Importance sampling enumerates the samples one at a time and, for each sample, assigns a value to each variable. It is also possible to start with all of the samples and, for each variable, generate a value for that variable for each sample. For example, for the data of Figure 8.28, the same data could be generated by generating all of the samples for Tampering before generating the samples for Fire. The particle filtering algorithm, or sequential Monte Carlo, generates all of the samples for one variable before moving to the next variable. It does one sweep through the variables, and for each variable it does a sweep through all of the samples. This algorithm is advantageous when variables are generated dynamically and there can be unboundedly many variables, as in sequential models. It also allows for a new operation: resampling.
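
To make the difference in generation order concrete, here is a small Python sketch. P(fire)=0.01 as in Example 8.45 below; P(tampering) is taken to be 0.02 here, and the helper functions are illustrative, not part of the book's code.

import random

def sample_tampering():           # assumed prior: P(tampering) = 0.02
    return random.random() < 0.02

def sample_fire():                # prior from Example 8.45: P(fire) = 0.01
    return random.random() < 0.01

n = 1000

# Importance sampling order: for each sample, assign every variable.
samples = [{"Tampering": sample_tampering(), "Fire": sample_fire()}
           for _ in range(n)]

# Particle filtering order: for each variable, assign every sample.
particles = [{} for _ in range(n)]
for part in particles:
    part["Tampering"] = sample_tampering()
for part in particles:
    part["Fire"] = sample_fire()

# Both orders generate samples from the same joint distribution.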

In particle filtering, the samples are called particles. A particle is a variable-value dictionary, where a dictionary is a representation of a partial function from keys into values; here each key is a variable and the particle maps it to a value. A particle has an associated weight. A set of particles is a population.

The algorithm starts with a population of n empty dictionaries. The algorithm repeatedly selects a variable according to an ordering where a variable is selected after its parents. If the variable is not observed, for each particle, a value for the variable for that particle is sampled from the distribution of the variable given the assignment of the particle. If the variable is observed, each particle’s weight is updated by multiplying by the probability of the observation given the assignment of the particle.

Given a population of n particles, resampling generates a new population of n particles, each with the same weight. Each particle is selected with probability proportional to its weight. Resampling can be implemented in the same way that random samples for a single random variable are generated, but particles, rather than values, are selected. Some particles may be selected multiple times and others might not be selected at all.
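
As a minimal sketch, resampling can be implemented with the weighted selection in Python's standard library; random.choices draws with replacement in proportion to the given weights. The function name and signature here are illustrative.

import random

def resample(particles, weights, n):
    # Select n particles, each chosen with probability proportional
    # to its weight; some may be selected several times, others never.
    chosen = random.choices(particles, weights=weights, k=n)
    # Copy each selected particle so that duplicates can later be
    # updated independently; the new particles all have equal weight.
    return [dict(part) for part in chosen]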

1: procedure Particle_filtering(B,e,Q,n):
2:      Inputs
3:          B: belief network
4:          e: the evidence; a variable-value assignment to some of the variables
5:          Q: query variable
6:          n: number of samples to generate      
7:      Output
8:          posterior distribution on Q
9:      Local
10:          particles is a set of particles
11:          array counts[k] where k in domain(Q)      
12:      particles:= list of n empty particles
13:      for each variable X in B, in order do
14:          if X=v is observed in e then
15:               for each part in particles do
16:                   part[X]:=v
17:                   weight[part]:=weight[part]*P(X=v | part[parents(X)])
18:               particles:= n particles selected from particles according to weight
19:          else
20:               for each part in particles do
21:                   sample v from distribution P(X | part[parents(X)])
22:                   part[X]:=v                             
23:      for each v in domain(Q) do
24:          counts[v]:= (number of part in particles such that part[Q]=v)/n      
25:      return counts
Figure 8.31: Particle filtering for belief network inference

The particle filtering algorithm is shown in Figure 8.31. Line 16 assigns X its observed value. Line 17, which is used when X is observed, updates the weight of each particle by multiplying by the probability of the observation X=v given the particle's assignment to X's parents. Line 18 resamples the population in proportion to these weights. Line 22 assigns X a value sampled from the distribution of X given the values of its parents in the particle.
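
The following Python sketch follows the pseudocode of Figure 8.31, under some simplifying assumptions that are not in the figure: variables are Boolean, and the network is given as a topologically ordered list of variable names, a parents dictionary, and a function prob(X, value, parent_values) returning P(X=value | parents(X)=parent_values). All of these names are illustrative.

import random

def particle_filtering(variables, parents, prob, evidence, query, n=1000):
    particles = [{} for _ in range(n)]           # line 12: n empty particles
    weights = [1.0] * n
    for X in variables:                          # line 13: parents before children
        if X in evidence:
            v = evidence[X]
            for i, part in enumerate(particles):
                part[X] = v                      # line 16: assign observed value
                pvals = tuple(part[p] for p in parents[X])
                weights[i] *= prob(X, v, pvals)  # line 17: weight the observation
            # line 18: resample in proportion to weight; the new population
            # has equal weights, so they are reset to 1
            particles = [dict(p) for p in
                         random.choices(particles, weights=weights, k=n)]
            weights = [1.0] * n
        else:
            for part in particles:
                pvals = tuple(part[p] for p in parents[X])
                # line 21: sample X from its conditional distribution
                part[X] = random.random() < prob(X, True, pvals)
    # lines 23-25: the posterior estimate is the proportion of each value
    p_true = sum(part[query] for part in particles) / n
    return {True: p_true, False: 1.0 - p_true}

With suitable parents and prob definitions for the network of Figure 8.3, a call such as particle_filtering(["Tampering", "Fire", "Alarm", "Smoke", "Leaving", "Report"], parents, prob, {"Smoke": True, "Report": True}, "Tampering") would estimate the query of Example 8.45 below.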

This algorithm resamples after each observation. It is also possible to resample less often, for example, after a number of variables are observed.

Importance sampling is equivalent to particle filtering without resampling. The principal difference is the order in which the particles are generated. In particle filtering, each variable is sampled for all particles before moving to the next variable, whereas, in importance sampling, all variables are sampled for one particle before the next particle is considered.

Particle filtering has two main advantages over importance sampling. First, it can be used for an unbounded number of variables, as in hidden Markov models and dynamic belief networks. Second, resampling enables the particles to better cover the distribution over the variables. Importance sampling typically produces some particles with very low weight, so that only a few particles cover most of the probability mass; resampling lets many particles more uniformly cover the probability mass.

Example 8.45.

Consider using particle filtering to compute P(tampering | smoke ∧ report) for the belief network of Figure 8.3. First generate the particles s1, ..., s1000. Suppose it first samples Fire. Out of the 1000 particles, about 10 will have Fire=true and about 990 will have Fire=false (as P(fire)=0.01). It then absorbs the evidence Smoke=true. Those particles with Fire=true will be weighted by 0.9 as P(smoke | fire)=0.9 and those particles with Fire=false will be weighted by 0.01 as P(smoke | ¬fire)=0.01. It then resamples; each particle is chosen in proportion to its weight. The particles with Fire=true and Fire=false will be chosen in the ratio 10*0.9 : 990*0.01, that is, 9 : 9.9. Thus, about 476 particles will be chosen with Fire=true, and the remainder with Fire=false. The other variables are sampled, in turn, until Report is observed, and the particles are resampled. At this stage, the estimate of the probability of tampering is the proportion of the particles with Tampering=true.
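
The resampling arithmetic of this example can be checked in a few lines of Python:

n = 1000
n_fire = round(n * 0.01)      # about 10 particles with Fire=true
n_nofire = n - n_fire         # about 990 particles with Fire=false

w_fire = n_fire * 0.9         # total weight of Fire=true particles: 9.0
w_nofire = n_nofire * 0.01    # total weight of Fire=false particles: 9.9

# A resampled particle has Fire=true with probability 9/18.9, about 0.476.
print(round(n * w_fire / (w_fire + w_nofire)))    # about 476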

Note that in particle filtering the particles are not independent, so Hoeffding’s inequality is not directly applicable.