Artificial Intelligence - foundations of computational agents -- 12.5.1 Proof Procedures with Function Symbols

Third edition of Artificial Intelligence: foundations of computational agents, Cambridge University Press, 2023 is now available (including the full text).

12.5.1 Proof Procedures with Function Symbols

The proof procedures with variables carry over for the case with function symbols. The main difference is that the class of terms is expanded to include function symbols.

The use of function symbols involves infinitely many terms. This means that, when forward chaining on the clauses, we have to ensure that the selection criterion for selecting clauses is fair.

Example 12.29: To see why fairness is important, consider the following clauses as part of a larger program:

num(0).
num(s(N))←num(N).

An unfair strategy could initially select the first of these clauses to forward chain on and, for every subsequent selection, select the second clause. The second clause can always be used to derive a new consequence. This strategy never selects any other clauses and thus never derives the consequences of these other clauses.

This problem of ignoring some clauses forever is known as starvation. A fair selection criterion is one such that any clause available to be selected will eventually be selected. The bottom-up proof procedure is complete only if it is fair.

The top-down proof procedure is the same as for Datalog [see Figure 12.3]. Unification becomes more complicated, because it must recursively descend into the structure of terms. There is one change to the unification algorithm: a variable X does not unify with a term t in which X occurs and is not X itself. Checking for this condition is known as the occurs check. If the occurs check is not used and a variable is allowed to unify with a term in which it appears, the proof procedure becomes unsound, as shown in the following example.

Example 12.30: Consider the knowledge base with only one clause:

lt(X,s(X)).

Suppose the intended interpretation is the domain of integers in which lt means "less than" and s(X) denotes the integer after X. The query ask lt(Y,Y) should fail because it is false in our intended interpretation that there is no number less than itself. However, if X and s(X) could unify, this query would succeed. In this case, the proof procedure would be unsound because something could be derived that is false in a model of the axioms.

The unification algorithm of Figure 12.5 finds the MGU of two terms with function symbols with one change. The algorithm should return ⊥ if it selects an equality x=y, where x is a variable and y is a term that is not x, but contains x. This last step is the occurs check. The occurs check is sometimes omitted (e.g., in Prolog), because removing it makes the proof procedure more efficient, even though removing it makes the proof procedure unsound.

The following example shows the details of SLD resolution with function symbols.

Example 12.31: Consider the clauses

append(c(A,X),Y,c( A,Z))←
append(X,Y,Z).
append( nil,Z,Z).

For now, ignore what this may mean. Like the computer, treat this as a problem of symbol manipulation. Consider the following query:

ask append(F,c(L,nil),c(l,c(i,c(s,c(t,nil))))).

The following is a derivation:

yes(F,L)←append(F,c(L,nil),c(l,c(i,c(s,c(t,nil)))))
     resolve with append(c(A₁,X₁),Y₁,c( A₁,Z₁))←append(X₁,Y₁,Z₁)
     substitution: {F/c(l,X₁), Y₁/c(L,nil), A₁/l ,Z₁/c(i,c(s,c(t,nil)))}
yes(c(l,X₁),L)←append(X₁,c(L,nil),c(i,c(s,c(t,nil))))
     resolve with append(c(A₂,X₂),Y₂,c( A₂,Z₂))←append(X₂,Y₂,Z₂)
     substitution: {X₁/c(i,X₂), Y₂/c(L,nil), A₂/i ,Z₂/c(s,c(t,nil))}
yes(c(l,c(i,X₂)),L)←append(X₂,c(L,nil),c(s,c(t,nil)))
     resolve with append(c(A₃,X₃),Y₃,c( A₃,Z₃))←append(X₃,Y₃,Z₃)
     substitution: {X₂/c(s,X₃), Y₃/c(L,nil), A₃/s ,Z₃/c(t,nil)}
yes(c(l,c(i,c(s,X₃))),L)←append(X₃,c(L,nil),c(t,nil))

At this stage both clauses are applicable. Choosing the first clause gives

resolve with append(c(A₄,X₄),Y₄,c( A₄,Z₄))←append(X₄,Y₄,Z₄)
substitution: {X₃/c(t,X₄), Y₄/c(L,nil), A₄/t ,Z₄/nil}
yes(c(l,c(i,c(s,X₃))),L)←append(X₄,c(L,nil),nil)

At this point, there are no clauses whose head unifies with the atom in the generalized answer clause's body. The proof fails.

Choosing the second clause instead of the first gives

resolve with append(nil,Z₅,Z₅).
substitution: {Z₅/c(t,nil),X₃/nil,L/t}
yes(c(l,c(i,c(s,nil))),t)←

At this point, the proof succeeds, with answer F=c(l,c(i,c(s,nil))), L=t.

For the rest of this chapter, we use the "syntactic sugar" notation of Prolog for representing lists. The empty list, nil, is written as []. The list with first element E and the rest of the list R, which was cons(E,R), is now written as [E|R]. There is one notational simplification: [X|[Y]] is written as [X,Y], where Y can be a sequence of values. For example, [a|[]] is written as [a], and [b|[a|[]]] is written as [b,a]; [a|[b|c]] is written as [a,b|c].

Example 12.32: Using the list notation, append from the previous example can be written as

append([A|X],Y,[A|Z])←
append(X,Y,Z).
append( [],Z,Z).

The query

ask append(F,[L],[l,i,s,t])

has an answer F=[l,i,s], L=t. The proof is exactly as in the previous example. As far as the proof procedure is concerned, nothing has changed; there is just a renamed function symbol and constant.