15.5 Proofs and Substitutions

Both the bottom-up and top-down propositional proof procedures of Section 5.3.2 can be extended to Datalog. A proof procedure extended for variables must account for the fact that a free variable in a clause means that all instances of the clause are true. A proof may have to use different instances of the same clause in a single proof.

15.5.1 Instances and Substitutions

An instance of a clause is obtained by uniformly substituting terms for variables in the clause. All occurrences of a particular variable are replaced by the same term.

The specification of which value is assigned to each variable is called a substitution. A substitution is a set of the form {V1/t1,,Vn/tn}, where each Vi is a distinct variable and each ti is a term. The element Vi/ti is a binding for variable Vi. A substitution is in normal form if no Vi appears in any tj.

Example 15.14.

For example, {X/Y,Z/chile} is a substitution in normal form that binds X to Y and binds Z to chile. The substitution{X/Y,Z/X} is not in normal form, because the variable X occurs both on the left and on the right of a binding.

The application of a substitution σ={V1/t1,,Vn/tn} to expression e, written eσ, is an expression that is the same as the original expression e except that each occurrence of Vi in e is replaced by the corresponding ti. The expression eσ is called an instance of e. If eσ does not contain any variables, it is called a ground instance of e.

Example 15.15.

Some applications of substitutions are

borders(peru,X){X/chile}=borders(peru,chile).
borders(Y,chile){Y/peru}=borders(peru,chile).
borders(peru,X){Y/peru,Z/X}=borders(peru,X).
p(X,X,Y,Y,Z){X/Z,Y/brazil}=p(Z,Z,brazil,brazil,Z).

Substitutions can apply to clauses, atoms, and terms. For example, the result of applying the substitution {X/Y,Z/peru} to the clause

p(X,Y)q(peru,Z,X,Y,Z)

is the clause

p(Y,Y)q(peru,peru,Y,Y,peru).

A substitution σ is a unifier of expressions e1 and e2 if e1σ is identical to e2σ; expressions e1 and e2 are said to unify. That is, a unifier of two expressions is a substitution that when applied to each expression results in the same expression.

Example 15.16.

Substitution {X/chile,Y/peru} is a unifier of borders(peru,X) and borders(Y,chile). Applying this substitution to either gives borders(peru,chile).

{X/a,Y/b} is a unifier of t(a,Y,c) and t(X,b,c) as

t(a,Y,c){X/a,Y/b}=t(X,b,c){X/a,Y/b}=t(a,b,c).

Expressions can have many unifiers.

Example 15.17.

Atoms p(X,Y) and p(Z,Z) have many unifiers, including {X/b,Y/b,Z/b}, {X/c,Y/c,Z/c}, {X/Z,Y/Z}, and {Y/X,Z/X}. The last two unifiers are more general than the first two, because the first two both have X the same as Z and Y the same as Z but make more commitments to what these values are.

Substitution σ is a most general unifier (MGU) of expressions e1 and e2 if

  • σ is a unifier of the two expressions, and

  • if substitution σ is also a unifier of e1 and e2, then eσ must be an instance of eσ for all expressions e.

Example 15.18.

{X/Z,Y/Z} and {Z/X,Y/X} are both most general unifiers of p(X,Y) and p(Z,Z). {X/a,Y/a,Z/a} is a unifier, but not a most general unifier, of these. The resulting applications are

p(X,Y){X/Z,Y/Z}=p(Z,Z){X/Z,Y/Z}=p(Z,Z)
p(X,Y){Z/X,Y/X}=p(Z,Z){Z/X,Y/X}=p(X,X)
p(X,Y){X/a,Y/a,Z/a}=p(Z,Z){X/a,Y/a,Z/a}=p(a,a).

Note that p(a,a) is an instance of p(Z,Z) and an instance of p(X,X), but these are not instances of p(a,a). p(Z,Z) and p(X,X) are instances of each other.

The definition of MGU refers to “all expressions e” to preclude a substitution such as {X/Z,Y/Z,W/a} being a most general unifier of p(X,Y) and p(Z,Z), because it affects other expressions such as r(W).

Expression e1 is a renaming of e2 if they differ only in the names of variables. In this case, they are both instances of each other.

If two expressions have a unifier, they have at least one MGU. The expressions resulting from applying the MGUs to the expressions are all renamings of each other. That is, if σ and σ are both MGUs of expressions e1 and e2, then e1σ is a renaming of e1σ.

15.5.2 Bottom-Up Procedure for Datalog

The propositional bottom-up proof procedure can be extended to Datalog by using ground instances of the clauses. A ground instance of a clause is obtained by uniformly substituting constants for the variables in the clause. The constants required are those appearing in the knowledge base or in the query. If there are no constants in the knowledge base or the query, one must be invented.

Example 15.19.

Suppose the knowledge base is

q(a).q(b).r(a).s(W)r(W).p(X,Y)q(X)s(Y).

The set of all ground instances is

q(a).q(b).r(a).s(a)r(a).s(b)r(b).p(a,a)q(a)s(a).p(a,b)q(a)s(b).p(b,a)q(b)s(a).p(b,b)q(b)s(b).

The propositional bottom-up proof procedure of Section 5.3.2 can be applied to the grounding to derive q(a), q(b), r(a), s(a), p(a,a), and p(b,a) as the ground instances that are logical consequences.

Example 15.20.

Suppose the knowledge base is

p(X,Y).
gp(W,W).

The bottom-up proof procedure for query “ask g” must invent a new constant symbol, say c. The set of all ground instances is then

p(c,c).
gp(c,c).

The propositional bottom-up proof procedure will derive p(c,c) and g.

If the query were “ask p(b,d)” the set of ground instances would change to reflect the constants b and d.

The bottom-up proof procedure applied to the grounding of the knowledge base is sound, because each instance of each rule is true in every model. This procedure is essentially the same as the variable-free case, but it uses the set of ground instances of the clauses, all of which are true because the variables in a clause are universally quantified.

This bottom-up procedure will eventually halt for Datalog because there are only finitely many grounded atoms, and one ground atom is added to the consequence set each time through the loop.

This procedure is also complete for ground atoms. That is, if a ground atom is a consequence of the knowledge base, it will be derived. To prove this, as in the propositional case, construct a particular generic model. Recall that a model specifies the domain, what the constants denote, and what is true. A Herbrand interpretation is an interpretation where the domain is symbolic and consists of all constants of the language. A constant is invented if there are no constants in the knowledge base or the query. In a Herbrand interpretation, each constant denotes itself. Thus, in the definition of an interpretation, D and ϕ are fixed for a given program, and all that needs to be specified is π, which defines the predicate symbols.

Consider the Herbrand interpretation where the true atoms are the ground instances of the relations that are eventually derived by the bottom-up procedure. It is easy to see that this Herbrand interpretation is a model of the rules given. As in the variable-free case, it is a minimal model in that it has the fewest true atoms of any model. If KBg for ground atom g, then g is true in the minimal model and, thus, is eventually derived.

15.5.3 Unification

A computer does not need to reason at the propositional level by grounding. Instead, it can reason in terms of variables instead of multiple ground instances. Sometimes, it needs to replace variables by other variables or by constants.

The problem of unification is the following: given two atoms or terms, determine whether they unify, and, if they do, return a unifier of them. The unification algorithm finds a most general unifier (MGU) of two atoms or returns if they do not unify.

1: procedure Unify(t1,t2)
2:      Inputs
3:          t1,t2: atoms or terms      
4:      Output
5:          most general unifier of t1 and t2 if it exists or otherwise
6:      Local
7:          E: a set of equality statements
8:          S: substitution      
9:      E{t1=t2}
10:      S={}
11:      while E{} do
12:          select and remove α=β from E
13:          if β is not identical to α then
14:               if α is a variable then
15:                   replace α with β everywhere in E and S
16:                   S{α/β}S
17:               else if β is a variable then
18:                   replace β with α everywhere in E and S
19:                   S{β/α}S
20:               else if α is p(α1,,αn) and β is p(β1,,βn) then
21:                   EE{α1=β1,,αn=βn}
22:               else
23:                   return                             
24:      return S
Figure 15.4: Unification algorithm for Datalog

The unification algorithm is given in Figure 15.4. E is a set of equality statements implying and implied by the unification of t1 and t2. S is a substitution in normal form; if α/β is in the substitution S, then, by construction, α is a variable that does not appear elsewhere in S or in E. In line 20, α and β must have the same predicate and the same number of arguments; otherwise the unification fails.

Example 15.21.

Consider the call Unify(p(X,Y,Y),p(a,Z,b)). Initially E is {p(X,Y,Y)=p(a,Z,b)}. The first time through the while loop, E becomes {X=a,Y=Z,Y=b}. Suppose X=a is selected next. Then S becomes {X/a} and E becomes {Y=Z,Y=b}. Suppose Y=Z is selected. Then Y is replaced by Z in S and E. S becomes {X/a,Y/Z} and E becomes {Z=b}. Finally Z=b is selected, Z is replaced by b, S becomes {X/a,Y/b,Z/b}, and E becomes empty. The substitution {X/a,Y/b,Z/b} is returned as an MGU.

Consider unifying p(a,Y,Y) with p(Z,Z,b). E starts off as {p(a,Y,Y)=p(Z,Z,b)}. In the next step, E becomes {a=Z,Y=Z,Y=b}. Then Z is replaced by a in E, and E becomes {Y=a,Y=b}. Then Y is replaced by a in E, and E becomes {a=b}, and then is returned indicating that there is no unifier.

15.5.4 Definite Resolution with Variables

The top-down proof procedure for propositional definite clauses can be extended to handle variables by allowing instances of rules to be used in the derivation.

A generalized answer clause is of the form

yes(t1,,tk)a1a2am

where t1,,tk are terms and a1,,am are atoms. The use of yes enables answer extraction: determining which instances of the query variables are a logical consequence of the knowledge base.

Initially, the generalized answer clause for query q is

yes(V1,,Vk)q

where V1,,Vk are the variables that appear in the query q. Intuitively this means that an instance of yes(V1,,Vk) is true if the corresponding instance of the query is true.

The proof procedure maintains a current generalized answer clause.

At each stage, the algorithm selects an atom in the body of the generalized answer clause. It then chooses a clause in the knowledge base whose head unifies with the atom.

The SLD resolution where SLD stands for selection linear definite, of the generalized answer clause

yes(t1,,tk)a1a2am

on a1 with the chosen clause

ab1bp

where a1 and a have most general unifier σ, is the answer clause

(yes(t1,,tk)b1bpa2am)σ

where the body of the chosen clause has replaced a1 in the answer clause, and the MGU σ is applied to the whole answer clause.

An SLD derivation is a sequence of generalized answer clauses γ0, γ1, , γn such that

  • γ0 is the answer clause corresponding to the original query. If the query is q, with free variables V1,,Vk, the initial generalized answer clause γ0 is

    yes(V1,,Vk)q.
  • γi is obtained by selecting an atom a1 in the body of γi1; choosing a copy of a clause ab1bp in the knowledge base whose head, a, unifies with ai; replacing a1 with the body, b1bp; and applying the unifier to the whole resulting answer clause.

    The main difference between this and the propositional top-down proof procedure is that, for clauses with variables, the proof procedure must take copies of clauses from the knowledge base. The copying renames the variables in the clause with new names. This is both to remove name clashes between variables and because a single proof may use different instances of a clause.

  • γn is an answer. That is, it is of the form

    yes(t1,,tk).

    When this occurs, the algorithm returns the answer

    V1=t1,,Vk=tk.

Notice how the answer is extracted; the arguments to yes keep track of the instances of the variables in the initial query that lead to a successful proof.

1: non-deterministic procedure Prove_datalog_TD(KB,q)
2:      Inputs
3:          KB: a set of definite clauses
4:          Query q: a set of atoms to prove, with variables V1,,Vk      
5:      Output
6:          substitution θ if KBqθ and fail otherwise
7:      Local
8:          G is a generalized answer clause      
9:      Set G to generalized answer clause yes(V1,,Vk)q
10:      while G is not an answer do
11:          Suppose G is yes(t1,,tk)a1a2am
12:          select atom a1 in the body of G
13:          choose clause ab1bp in KB
14:          Rename all variables in ab1bp to have new names
15:          Let σ be Unify(a1,a). Fail if Unify returns .
16:          G:=(yes(t1,,tk)b1bpa2am)σ      
17:      return {V1=t1,,Vk=tk} where G is yes(t1,,tk)
Figure 15.5: Top-down definite-clause proof procedure for Datalog

Figure 15.5 gives a non-deterministic algorithm that answers queries by searching for SLD derivations. This is non-deterministic in the sense that all derivations can be found by making appropriate choices that do not fail. If all choices fail, the algorithm fails, and there are no derivations. The “choose” on line 13 is implemented using search, as in Example 5.12. Recall that Unify(ai,a) returns an MGU of ai and a, if there is one, and if they do not unify. The algorithm for Unify is given in Figure 15.4.

Example 15.22.

The singer Shakira was born in Barranquilla, the capital of Atlántico Department in Colombia. It follows that she was born in Colombia and so also in South America. Consider the following propositions (as part of a larger knowledge base):

born_in(shakira,barranquilla).
born_in(P,L)part_of(S,L)born_in(P,S).
part_of(barranquilla,atlantico).
part_of(atlantico,colombia).
part_of(colombia,south_america).

A query to ask who is born in Colombia is

ask born_in(P,colombia).

Figure 15.6 shows a successful derivation with answer P=shakira.

yes(P)born_in(P,colombia)
       resolve with: born_in(P1,L1)part_of(S1,L1)born_in(P1,S1)
       substitution: {P1/P,L1/colombia}
yes(P)part_of(S1,colombia)born_in(P,S1)
       select leftmost conjunct
       resolve with: part_of(atlantico,colombia)
       substitution: {S1/atlantico}
yes(P)born_in(P,atlantico)
       resolve with: born_in(P2,L2)part_of(S2,L2)born_in(P2,S2)
       substitution: {P2/P,L2/atlantico}
yes(P)part_of(S2,atlantico)born_in(P,S2)
       resolve with: part_of(barranquilla,atlantico)
       substitution: {S2/barranquilla}
yes(P)born_in(P,barranquilla)
       resolve with: bornin(shakira,barranquilla)
       substitution: {P/shakira}
yes(shakira)
Figure 15.6: A derivation for query born_in(P,colombia)

Note that this derivation used two instances of the rule

born_in(P,L)part_of(S,L)born_in(P,S).

One instance eventually substituted colombia for S, and one instance substituted atlantico for S.