13.5 Function Symbols

The third edition of Artificial Intelligence: foundations of computational agents, Cambridge University Press, 2023 is now available (including full text).

13.5.1 Proof Procedures with Function Symbols

The proof procedures with variables carry over for the case with function symbols. The main difference is that the class of terms is expanded to include function symbols.

The use of function symbols involves infinitely many terms. This means that, when forward chaining on the clauses, we have to ensure that the selection criterion for selecting clauses is fair.

Example 13.30.

To see why fairness is important, consider the following clauses:

num(0).
num(s(N))num(N).
ab.
b.

An unfair strategy could initially select the first of these clauses to forward chain on and, for every subsequent selection, select the second clause. The second clause can always be used to derive a new consequence. This strategy never selects either of the last two clauses and thus never derives a or b.

This problem of ignoring some clauses forever is known as starvation. A fair selection criterion is one such that any clause available to be selected will eventually be selected. The bottom-up proof procedure can generate an infinite sequence of consequences and if the selection is fair, each consequence will eventually be generated and so the proof procedure is complete.

The top-down proof procedure is the same as for Datalog (see Figure 13.4). Unification becomes more complicated, because it must recursively descend into the structure of terms. There is one change to the unification algorithm: a variable X does not unify with a term t in which X occurs and is not X itself. Checking for this condition is known as the occurs check. If the occurs check is not used and a variable is allowed to unify with a term in which it appears, the proof procedure becomes unsound, as shown in the following example.

Example 13.31.

Consider the knowledge base with only one clause:

lt(X,s(X)).

Suppose the intended interpretation is the domain of integers in which lt means “less than” and s(X) denotes the integer after X. The query 𝖺𝗌𝗄 lt(Y,Y) should fail because it is false in the intended interpretation; there is no number less than itself. However, if X and s(X) could unify, this query would succeed. In this case, the proof procedure would be unsound because something could be derived that is false in a model of the axioms.

The unification algorithm of Figure 13.3 requires one change to find the most general unifier of two terms with function symbols. The algorithm should return if it selects an equality α=β, where α is a variable and β is a term that is not α, but contains α (or the other way around). This step is the occurs check. The occurs check is sometimes omitted (e.g., in Prolog), because removing it makes the proof procedure more efficient, even though removing it does make the proof procedure unsound.

The following example shows the details of SLD resolution with function symbols.

Example 13.32.

Consider the clauses

append(c(A,X),Y,c(A,Z))
    append(X,Y,Z).
append(nil,Z,Z).

For now, ignore what this may mean. Like the computer, treat this as a problem of symbol manipulation. Consider the following query:

𝘢𝘴𝘬 append(F,c(L,nil),c(l,c(i,c(s,c(t,nil))))).

The following is a derivation:

yes(F,L)append(F,c(L,nil),c(l,c(i,c(s,c(t,nil)))))
    resolve with append(c(A1,X1),Y1,c(A1,Z1))append(X1,Y1,Z1)
    substitution: {F/c(l,X1),Y1/c(L,nil),A1/l,Z1/c(i,c(s,c(t,nil)))}
yes(c(l,X1),L)append(X1,c(L,nil),c(i,c(s,c(t,nil))))
    resolve with append(c(A2,X2),Y2,c(A2,Z2))append(X2,Y2,Z2)
    substitution: {X1/c(i,X2),Y2/c(L,nil),A2/i,Z2/c(s,c(t,nil))}
yes(c(l,c(i,X2)),L)append(X2,c(L,nil),c(s,c(t,nil)))
    resolve with append(c(A3,X3),Y3,c(A3,Z3))append(X3,Y3,Z3)
    substitution: {X2/c(s,X3),Y3/c(L,nil),A3/s,Z3/c(t,nil)}
yes(c(l,c(i,c(s,X3))),L)append(X3,c(L,nil),c(t,nil))

At this stage both clauses are applicable. Choosing the first clause gives

    resolve with append(c(A4,X4),Y4,c(A4,Z4))append(X4,Y4,Z4)
    substitution: {X3/c(t,X4),Y4/c(L,nil),A4/t,Z4/nil}
yes(c(l,c(i,c(s,X3))),L)append(X4,c(L,nil),nil)

At this point, there are no clauses whose head unifies with the atom in the generalized answer clause’s body. The proof fails.

Choosing the second clause instead of the first gives

    resolve with append(nil,Z5,Z5)
    substitution: {Z5/c(t,nil),X3/nil,L/t}
yes(c(l,c(i,c(s,nil))),t)

At this point, the proof succeeds, with answer F=c(l,c(i,c(s,nil))), L=t.

For the rest of this chapter, we use the “syntactic sugar” notation of Prolog for representing lists. The empty list, nil, is written as []. The list with first element E and the rest of the list R, which was cons(E,R) in Example 13.29 is now written as [ER]. There is one other notational simplification: [X[Y]] is written as [X,Y], where Y can be a sequence of values. For example, [a[]] is written as [a], and [b[a[]]] is written as [b,a]. The term [a[bC]] is written as [a,bC].

Example 13.33.

Using the list notation, append from the previous example can be written as

append([AX],Y,[AZ])
    append(X,Y,Z).
append([],Z,Z).

The query

𝘢𝘴𝘬 append(F,[L],[l,i,s,t])

has an answer F=[l,i,s], L=t. The proof is exactly as in the previous example. As far as the proof procedure is concerned, nothing has changed; there is just a renamed function symbol and constant.