13.4 Proofs and Substitutions

The third edition of Artificial Intelligence: foundations of computational agents, Cambridge University Press, 2023 is now available (including full text).

13.4.4 Definite Resolution with Variables

The top-down proof procedure can be extended to handle variables by allowing instances of rules to be used in the derivation.

A generalized answer clause is of the form

yes(t1,,tk)a1a2am

where t1,,tk are terms and a1,,am are atoms. The use of yes enables answer extraction: determining which instances of the query variables are a logical consequence of the knowledge base.

Initially, the generalized answer clause for query q is

yes(V1,,Vk)q

where V1,,Vk are the variables that appear in q. Intuitively this means that an instance of yes(V1,,Vk) is true if the corresponding instance of the query is true.

The proof procedure maintains a current generalized answer clause.

At each stage, the algorithm selects an atom in the body of the generalized answer clause. It then chooses a clause in the knowledge base whose head unifies with the atom.

The SLD resolution of the generalized answer clause

yes(t1,,tk)a1a2am

on a1 with the chosen clause

ab1bp

where a1 and a have most general unifier σ, is the answer clause

(yes(t1,,tk)b1bpa2am)σ

where the body of the chosen clause has replaced a1 in the answer clause, and the MGU σ is applied to the whole answer clause.

An SLD derivation is a sequence of generalized answer clauses γ0, γ1, , γn such that

  • γ0 is the answer clause corresponding to the original query. If the query is q, with free variables V1,,Vk, the initial generalized answer clause γ0 is

    yes(V1,,Vk)q.
  • γi is obtained by selecting an atom a1 in the body of γi-1; choosing a copy of a clause ab1bp in the knowledge base whose head, a, unifies with ai; replacing a1 with the body, b1bp; and applying the unifier to the whole resulting answer clause.

    The main difference between this and the propositional top-down proof procedure is that, for clauses with variables, the proof procedure must take copies of clauses from the knowledge base. The copying renames the variables in the clause with new names. This is both to remove name clashes between variables and because a single proof may use different instances of a clause.

  • γn is an answer. That is, it is of the form

    yes(t1,,tk).

    When this occurs, the algorithm returns the answer

    V1=t1,,Vk=tk.

Notice how the answer is extracted; the arguments to yes keep track of the instances of the variables in the initial query that lead to a successful proof.

1: non-deterministic procedure Prove_datalog_TD(KB,q)
2:      Inputs
3:          KB: a set of definite clauses
4:          Query q: a set of atoms to prove, with variables V1,,Vk      
5:      Output
6:          substitution θ if KBqθ and fail otherwise
7:      Local
8:          G is a generalized answer clause      
9:      Set G to generalized answer clause yes(V1,,Vk)q
10:      while G is not an answer do
11:          Suppose G is yes(t1,,tk)a1a2am
12:          select atom a1 in the body of G
13:          choose clause ab1bp in KB
14:          Rename all variables in ab1bp to have new names
15:          Let σ be Unify(a1,a). Fail if Unify returns .
16:          G:=(yes(t1,,tk)b1bpa2am)σ      
17:      return {V1=t1,,Vk=tk} where G is yes(t1,,tk)
Figure 13.4: Top-down definite clause proof procedure for Datalog

Figure 13.4 gives a non-deterministic algorithm that answer queries by searching for SLD derivations. This is non-deterministic in the sense that all derivations can be found by making appropriate choices that do not fail. If all choices fail, the algorithm fails, and there are no derivations. The “choose” on line 13 is implemented using search. Recall that Unify(ai,a) returns an MGU of ai and a, if there is one, and if they do not unify. The algorithm for Unify is given in Figure 13.3.

Example 13.23.

Consider the database of Figure 13.2 and the query

𝘢𝘴𝘬 two_doors_east(R,r107).

Figure 13.5 shows a successful derivation with answer R=r111.

yes(R)two_doors_east(R,r107)
    resolve with two_doors_east(E1,W1)
           imm_east(E1,M1)imm_east(M1,W1).
    substitution: {E1/R,W1/r107}
yes(R)imm_east(R,M1)imm_east(M1,r107)
    select leftmost conjunct
    resolve with imm_east(E2,W2)imm_west(W2,E2)
    substitution: {E2/R,W2/M1}
yes(R)imm_west(M1,R)imm_east(M1,r107)
    select leftmost conjunct
    resolve with imm_west(r109,r111)
    substitution: {M1/r109,R/r111}
yes(r111)imm_east(r109,r107)
    resolve with imm_east(E3,W3)imm_west(W3,E3)
    substitution: {E3/r109,W3/r107}
yes(r111)imm_west(r107,r109)
    resolve with imm_west(r107,r109)
    substitution: {}
yes(r111)
Figure 13.5: A derivation for query two_doors_east(R,r107)

Note that this derivation used two instances of the rule

imm_east(E,W)imm_west(W,E).

One instance eventually substituted r111 for E, and one instance substituted r109 for E.

When the atom imm_west(M1,R) was selected, other choices of the clause to resolve with would have resulted in a partial derivation that could not be completed.