# 15.5 Proofs and Substitutions

Both the bottom-up and top-down propositional proof procedures of Section 5.3.2 can be extended to Datalog. A proof procedure extended for variables must account for the fact that a free variable in a clause means that all instances of the clause are true. A proof may have to use different instances of the same clause in a single proof.

## 15.5.1 Instances and Substitutions

An instance of a clause is obtained by uniformly substituting terms for variables in the clause. All occurrences of a particular variable are replaced by the same term.

The specification of which value is assigned to each variable is called a substitution. A substitution is a set of the form $\{V_{1}/t_{1},\ldots,V_{n}/t_{n}\}$, where each $V_{i}$ is a distinct variable and each $t_{i}$ is a term. The element $V_{i}/t_{i}$ is a binding for variable $V_{i}$. A substitution is in normal form if no $V_{i}$ appears in any $t_{j}$.

###### Example 15.14.

For example, $\{X/Y,Z/chile\}$ is a substitution in normal form that binds $X$ to $Y$ and binds $Z$ to $chile$. The substitution$\{X/Y,Z/X\}$ is not in normal form, because the variable $X$ occurs both on the left and on the right of a binding.

The application of a substitution $\sigma=\{V_{1}/t_{1},\ldots,V_{n}/t_{n}\}$ to expression $e$, written $e\sigma$, is an expression that is the same as the original expression $e$ except that each occurrence of $V_{i}$ in $e$ is replaced by the corresponding $t_{i}$. The expression $e\sigma$ is called an instance of $e$. If $e\sigma$ does not contain any variables, it is called a ground instance of $e$.

###### Example 15.15.

Some applications of substitutions are

 $borders(peru,X)\{X/chile\}=borders(peru,chile).$ $borders(Y,chile)\{Y/peru\}=borders(peru,chile).$ $borders(peru,X)\{Y/peru,Z/X\}=borders(peru,X).$ $p(X,X,Y,Y,Z)\{X/Z,Y/brazil\}=p(Z,Z,brazil,brazil,Z).$

Substitutions can apply to clauses, atoms, and terms. For example, the result of applying the substitution $\{X/Y,Z/peru\}$ to the clause

 $p(X,Y)\leftarrow\mbox{}q(peru,Z,X,Y,Z)$

is the clause

 $p(Y,Y)\leftarrow\mbox{}q(peru,peru,Y,Y,peru).$

A substitution $\sigma$ is a unifier of expressions $e_{1}$ and $e_{2}$ if $e_{1}\sigma$ is identical to $e_{2}\sigma$; expressions $e_{1}$ and $e_{2}$ are said to unify. That is, a unifier of two expressions is a substitution that when applied to each expression results in the same expression.

###### Example 15.16.

Substitution $\{X/chile,Y/peru\}$ is a unifier of $borders(peru,X)$ and $borders(Y,chile)$. Applying this substitution to either gives $borders(peru,chile)$.

$\{X/a,Y/b\}$ is a unifier of $t(a,Y,c)$ and $t(X,b,c)$ as

 $t(a,Y,c)\{X/a,Y/b\}=t(X,b,c)\{X/a,Y/b\}=t(a,b,c).$

Expressions can have many unifiers.

###### Example 15.17.

Atoms $p(X,Y)$ and $p(Z,Z)$ have many unifiers, including $\{X/b,Y/b,Z/b\}$, $\{X/c,Y/c,Z/c\}$, $\{X/Z,Y/Z\}$, and $\{Y/X,Z/X\}$. The last two unifiers are more general than the first two, because the first two both have $X$ the same as $Z$ and $Y$ the same as $Z$ but make more commitments to what these values are.

Substitution $\sigma$ is a most general unifier (MGU) of expressions $e_{1}$ and $e_{2}$ if

• $\sigma$ is a unifier of the two expressions, and

• if substitution $\sigma^{\prime}$ is also a unifier of $e_{1}$ and $e_{2}$, then $e\sigma^{\prime}$ must be an instance of $e\sigma$ for all expressions $e$.

###### Example 15.18.

$\{X/Z,Y/Z\}$ and $\{Z/X,Y/X\}$ are both most general unifiers of $p(X,Y)$ and $p(Z,Z)$. $\{X/a,Y/a,Z/a\}$ is a unifier, but not a most general unifier, of these. The resulting applications are

 $p(X,Y)\{X/Z,Y/Z\}=p(Z,Z)\{X/Z,Y/Z\}=p(Z,Z)$ $p(X,Y)\{Z/X,Y/X\}=p(Z,Z)\{Z/X,Y/X\}=p(X,X)$ $p(X,Y)\{X/a,Y/a,Z/a\}=p(Z,Z)\{X/a,Y/a,Z/a\}=p(a,a).$

Note that $p(a,a)$ is an instance of $p(Z,Z)$ and an instance of $p(X,X)$, but these are not instances of $p(a,a)$. $p(Z,Z)$ and $p(X,X)$ are instances of each other.

The definition of MGU refers to “all expressions $e$” to preclude a substitution such as $\{X/Z,Y/Z,W/a\}$ being a most general unifier of $p(X,Y)$ and $p(Z,Z)$, because it affects other expressions such as $r(W)$.

Expression $e_{1}$ is a renaming of $e_{2}$ if they differ only in the names of variables. In this case, they are both instances of each other.

If two expressions have a unifier, they have at least one MGU. The expressions resulting from applying the MGUs to the expressions are all renamings of each other. That is, if $\sigma$ and $\sigma^{\prime}$ are both MGUs of expressions $e_{1}$ and $e_{2}$, then $e_{1}\sigma$ is a renaming of $e_{1}\sigma^{\prime}$.

## 15.5.2 Bottom-Up Procedure for Datalog

The propositional bottom-up proof procedure can be extended to Datalog by using ground instances of the clauses. A ground instance of a clause is obtained by uniformly substituting constants for the variables in the clause. The constants required are those appearing in the knowledge base or in the query. If there are no constants in the knowledge base or the query, one must be invented.

###### Example 15.19.

Suppose the knowledge base is

 $\begin{array}[]{lll}{q(a).}&{q(b).}&{r(a).}\\ {s(W)\leftarrow\mbox{}r(W).}&{p(X,Y)\leftarrow\mbox{}q(X)\wedge\mbox{}s(Y).}% \end{array}$

The set of all ground instances is

 $\begin{array}[]{lll}{q(a).}&{q(b).}&{r(a).}\\ {s(a)\leftarrow\mbox{}r(a).}&{s(b)\leftarrow\mbox{}r(b).}&{p(a,a)\leftarrow% \mbox{}q(a)\wedge\mbox{}s(a).}\\ {p(a,b)\leftarrow\mbox{}q(a)\wedge\mbox{}s(b).}&{p(b,a)\leftarrow\mbox{}q(b)% \wedge\mbox{}s(a).}&{p(b,b)\leftarrow\mbox{}q(b)\wedge\mbox{}s(b).}\end{array}$

The propositional bottom-up proof procedure of Section 5.3.2 can be applied to the grounding to derive $q(a)$, $q(b)$, $r(a)$, $s(a)$, $p(a,a)$, and $p(b,a)$ as the ground instances that are logical consequences.

###### Example 15.20.

Suppose the knowledge base is

 $p(X,Y).$ $g\leftarrow\mbox{}p(W,W).$

The bottom-up proof procedure for query “$\mbox{{ask}~{}}~{}g$” must invent a new constant symbol, say $c$. The set of all ground instances is then

 $p(c,c).$ $g\leftarrow\mbox{}p(c,c).$

The propositional bottom-up proof procedure will derive $p(c,c)$ and $g$.

If the query were “$\mbox{{ask}~{}}~{}p(b,d)$” the set of ground instances would change to reflect the constants $b$ and $d$.

The bottom-up proof procedure applied to the grounding of the knowledge base is sound, because each instance of each rule is true in every model. This procedure is essentially the same as the variable-free case, but it uses the set of ground instances of the clauses, all of which are true because the variables in a clause are universally quantified.

This bottom-up procedure will eventually halt for Datalog because there are only finitely many grounded atoms, and one ground atom is added to the consequence set each time through the loop.

This procedure is also complete for ground atoms. That is, if a ground atom is a consequence of the knowledge base, it will be derived. To prove this, as in the propositional case, construct a particular generic model. Recall that a model specifies the domain, what the constants denote, and what is true. A Herbrand interpretation is an interpretation where the domain is symbolic and consists of all constants of the language. A constant is invented if there are no constants in the knowledge base or the query. In a Herbrand interpretation, each constant denotes itself. Thus, in the definition of an interpretation, $D$ and $\phi$ are fixed for a given program, and all that needs to be specified is $\pi$, which defines the predicate symbols.

Consider the Herbrand interpretation where the true atoms are the ground instances of the relations that are eventually derived by the bottom-up procedure. It is easy to see that this Herbrand interpretation is a model of the rules given. As in the variable-free case, it is a minimal model in that it has the fewest true atoms of any model. If $KB\models g$ for ground atom $g$, then $g$ is true in the minimal model and, thus, is eventually derived.

## 15.5.3 Unification

A computer does not need to reason at the propositional level by grounding. Instead, it can reason in terms of variables instead of multiple ground instances. Sometimes, it needs to replace variables by other variables or by constants.

The problem of unification is the following: given two atoms or terms, determine whether they unify, and, if they do, return a unifier of them. The unification algorithm finds a most general unifier (MGU) of two atoms or returns $\bot$ if they do not unify.

The unification algorithm is given in Figure 15.4. $E$ is a set of equality statements implying and implied by the unification of $t_{1}$ and $t_{2}$. $S$ is a substitution in normal form; if $\alpha/\beta$ is in the substitution $S$, then, by construction, $\alpha$ is a variable that does not appear elsewhere in $S$ or in $E$. In line 20, $\alpha$ and $\beta$ must have the same predicate and the same number of arguments; otherwise the unification fails.

###### Example 15.21.

Consider the call $Unify(p(X,Y,Y),p(a,Z,b))$. Initially $E$ is $\{p(X,Y,Y)=p(a,Z,b)\}$. The first time through the while loop, $E$ becomes $\{X=a,Y=Z,Y=b\}$. Suppose $X=a$ is selected next. Then $S$ becomes $\{X/a\}$ and $E$ becomes $\{Y=Z,Y=b\}$. Suppose $Y=Z$ is selected. Then $Y$ is replaced by $Z$ in $S$ and $E$. $S$ becomes $\{X/a,Y/Z\}$ and $E$ becomes $\{Z=b\}$. Finally $Z=b$ is selected, $Z$ is replaced by $b$, $S$ becomes $\{X/a,Y/b,Z/b\}$, and $E$ becomes empty. The substitution $\{X/a,Y/b,Z/b\}$ is returned as an MGU.

Consider unifying $p(a,Y,Y)$ with $p(Z,Z,b)$. $E$ starts off as $\{p(a,Y,Y)=p(Z,Z,b)\}$. In the next step, $E$ becomes $\{a=Z,Y=Z,Y=b\}$. Then $Z$ is replaced by $a$ in $E$, and $E$ becomes $\{Y=a,Y=b\}$. Then $Y$ is replaced by $a$ in $E$, and $E$ becomes $\{a=b\}$, and then $\bot$ is returned indicating that there is no unifier.

## 15.5.4 Definite Resolution with Variables

The top-down proof procedure for propositional definite clauses can be extended to handle variables by allowing instances of rules to be used in the derivation.

A generalized answer clause is of the form

 $yes(t_{1},\ldots,t_{k})\leftarrow\mbox{}a_{1}\wedge\mbox{}a_{2}\wedge\mbox{}% \ldots\wedge\mbox{}a_{m}$

where $t_{1},\ldots,t_{k}$ are terms and $a_{1},\ldots,a_{m}$ are atoms. The use of $yes$ enables answer extraction: determining which instances of the query variables are a logical consequence of the knowledge base.

Initially, the generalized answer clause for query $q$ is

 $yes(V_{1},\ldots,V_{k})\leftarrow\mbox{}q$

where $V_{1},\ldots,V_{k}$ are the variables that appear in the query $q$. Intuitively this means that an instance of $yes(V_{1},\ldots,V_{k})$ is true if the corresponding instance of the query is true.

The proof procedure maintains a current generalized answer clause.

At each stage, the algorithm selects an atom in the body of the generalized answer clause. It then chooses a clause in the knowledge base whose head unifies with the atom.

The SLD resolution where SLD stands for selection linear definite, of the generalized answer clause

 $yes(t_{1},\ldots,t_{k})\leftarrow\mbox{}a_{1}\wedge\mbox{}a_{2}\wedge\mbox{}% \ldots\wedge\mbox{}a_{m}$

on $a_{1}$ with the chosen clause

 $a\leftarrow\mbox{}b_{1}\wedge\mbox{}\ldots\wedge\mbox{}b_{p}$

where $a_{1}$ and $a$ have most general unifier $\sigma$, is the answer clause

 $(yes(t_{1},\ldots,t_{k})\leftarrow\mbox{}b_{1}\wedge\mbox{}\ldots\wedge\mbox{}% b_{p}\wedge\mbox{}a_{2}\wedge\mbox{}\ldots\wedge\mbox{}a_{m})\sigma$

where the body of the chosen clause has replaced $a_{1}$ in the answer clause, and the MGU $\sigma$ is applied to the whole answer clause.

An SLD derivation is a sequence of generalized answer clauses $\gamma_{0}$, $\gamma_{1}$, $\ldots$, $\gamma_{n}$ such that

• $\gamma_{0}$ is the answer clause corresponding to the original query. If the query is $q$, with free variables $V_{1},\ldots,V_{k}$, the initial generalized answer clause $\gamma_{0}$ is

 $yes(V_{1},\ldots,V_{k})\leftarrow\mbox{}q.$
• $\gamma_{i}$ is obtained by selecting an atom $a_{1}$ in the body of $\gamma_{i-1}$; choosing a copy of a clause $a\leftarrow\mbox{}b_{1}\wedge\mbox{}\ldots\wedge\mbox{}b_{p}$ in the knowledge base whose head, $a$, unifies with $a_{i}$; replacing $a_{1}$ with the body, $b_{1}\wedge\mbox{}\ldots\wedge\mbox{}b_{p}$; and applying the unifier to the whole resulting answer clause.

The main difference between this and the propositional top-down proof procedure is that, for clauses with variables, the proof procedure must take copies of clauses from the knowledge base. The copying renames the variables in the clause with new names. This is both to remove name clashes between variables and because a single proof may use different instances of a clause.

• $\gamma_{n}$ is an answer. That is, it is of the form

 $yes(t_{1},\ldots,t_{k})\leftarrow\mbox{}.$

When this occurs, the algorithm returns the answer

 $V_{1}=t_{1},\ldots,V_{k}=t_{k}.$

Notice how the answer is extracted; the arguments to $yes$ keep track of the instances of the variables in the initial query that lead to a successful proof.

Figure 15.5 gives a non-deterministic algorithm that answers queries by searching for SLD derivations. This is non-deterministic in the sense that all derivations can be found by making appropriate choices that do not fail. If all choices fail, the algorithm fails, and there are no derivations. The “choose” on line 13 is implemented using search, as in Example 5.12. Recall that $Unify(a_{i},a)$ returns an MGU of $a_{i}$ and $a$, if there is one, and $\bot$ if they do not unify. The algorithm for $Unify$ is given in Figure 15.4.

###### Example 15.22.

The singer Shakira was born in Barranquilla, the capital of Atlántico Department in Colombia. It follows that she was born in Colombia and so also in South America. Consider the following propositions (as part of a larger knowledge base):

 $born\_in(shakira,barranquilla).$ $born\_in(P,L)\leftarrow part\_of(S,L)\land born\_in(P,S).$ $part\_of(barranquilla,atlantico).$ $part\_of(atlantico,colombia).$ $part\_of(colombia,south\_america).$

A query to ask who is born in Colombia is

 $\mbox{{ask}~{}}~{}born\_in(P,colombia).$

Figure 15.6 shows a successful derivation with answer $P=shakira$.

Note that this derivation used two instances of the rule

 $born\_in(P,L)\leftarrow part\_of(S,L)\land born\_in(P,S).$

One instance eventually substituted $colombia$ for $S$, and one instance substituted $atlantico$ for $S$.