8.6 Stochastic Simulation

The third edition of Artificial Intelligence: foundations of computational agents, Cambridge University Press, 2023 is now available (including full text).

8.6.4 Likelihood Weighting

Instead of creating a sample and then rejecting it, it is possible to mix sampling with inference to reason about the probability that a sample would be rejected. In importance sampling methods, each sample has a weight, and the sample average is computed using the weighted average of samples. Likelihood weighting is a form of importance sampling where the variables are sampled in the order defined by a belief network, and evidence is used to update the weights. The weights reflect the probability that a sample would not be rejected.

Example 8.42.

Consider the belief network of Figure 8.3. In this P(fire)=0.01, P(smokefire)=0.9 and P(smoke¬fire)=0.01. Suppose Smoke=true is observed, and another descendant of Fire is queried.

Starting with 1000 samples, approximately 10 will have Fire=true, and the other 990 samples will have Fire=false. In rejection sampling, of the 990 with Fire=false, 1%, which is approximately 10, will have Smoke=true and so will not be rejected. The remaining 980 samples will be rejected. Of the 10 with Fire=true, about 9 will not be rejected. Thus about 98% of the samples are rejected.

Instead of rejecting so many samples, the samples with Fire=true are weighted by 0.9 and the samples with Fire=false are weighted with 0.01. This potentially give a much better estimate of any of the probabilities that use these samples.

1: procedure Likelihood_weighting(B,e,Q,n):
2:      Inputs
3:          B: belief network
4:          e: the evidence; a variable-value assignment to some of the variables
5:          Q: query variable
6:          n: number of samples to generate      
7:      Output
8:          posterior distribution on Q
9:      Local
10:          array sample[var], where sample[var]domain(var)
11:          real array counts[k] for kdomain(Q), initialized to 0      
12:      repeat n times
13:          sample:={}
14:          weight:=1
15:          for each variable X in B, in order do
16:               if X=v is in e then
17:                   sample[X]:=v
18:                   weight:=weight*P(X=vparents(X))
19:               else
20:                   sample[X]:= a random sample from P(Xparents(X))                        
21:          v:=sample[Q]
22:          counts[v]:=counts[v]+weight      
23:      return counts/vcounts[v]
Figure 8.30: Likelihood weighting for belief network inference

Figure 8.30 shows the details of the likelihood weighting for computing P(Qe) for query variable Q and evidence e. The for loop (from line 15) creates a sample containing a value for all of the variables. Each observed variable changes the weight of the sample by multiplying by the probability of the observed value given the assignment of the parents in the sample. The variables not observed are sampled according the probability of the variable given its parents in the sample. Note that the variables are sampled in an order to ensure that the parents of a variable have been assigned in the sample before the variable is selected.

To extract the distribution of the query variable Q, the algorithm maintain an array counts, such that counts[v] is the sum of the weights of the samples where Q=v. This algorithm can also be adapted to the case where the query is some complicated condition on the values; we just have to count the cases where the condition is true and those where the condition is false.

Example 8.43.

Suppose we want to use likelihood weighting to compute P(Tamperingsmoke¬report).

The following table gives a few samples. In this table, s is the sample; e is ¬smokereport. The weight is P(es), which is equal to P(smokeFire)*P(¬reportLeaving), where the value for Fire and Leaving are from the sample.

Tampering Fire Alarm Smoke Leaving Report weight
false true false true true false 0.9*0.25=0.225
true true true true false false 0.9*0.99=0.891
false false false true true false 0.01*0.25=0.0025
false true false true false false 0.9*0.99=0.891

P(tampering¬smokereport) is estimated from the weighted proportion of the samples that have Tampering true.