foundations of computational agents
Both the bottom-up and top-down propositional proof procedures of Section 5.3.2 can be extended to Datalog. A proof procedure extended for variables must account for the fact that a free variable in a clause means that all instances of the clause are true. A proof may have to use different instances of the same clause in a single proof.
An instance of a clause is obtained by uniformly substituting terms for variables in the clause. All occurrences of a particular variable are replaced by the same term.
The specification of which value is assigned to each variable is called a substitution. A substitution is a set of the form , where each is a distinct variable and each is a term. The element is a binding for variable . A substitution is in normal form if no appears in any .
For example, is a substitution in normal form that binds to and binds to . The substitution is not in normal form, because the variable occurs both on the left and on the right of a binding.
The application of a substitution to expression , written , is an expression that is the same as the original expression except that each occurrence of in is replaced by the corresponding . The expression is called an instance of . If does not contain any variables, it is called a ground instance of .
Some applications of substitutions are
Substitutions can apply to clauses, atoms, and terms. For example, the result of applying the substitution to the clause
is the clause
A substitution is a unifier of expressions and if is identical to ; expressions and are said to unify. That is, a unifier of two expressions is a substitution that when applied to each expression results in the same expression.
Substitution is a unifier of and . Applying this substitution to either gives .
is a unifier of and as
Expressions can have many unifiers.
Atoms and have many unifiers, including , , , and . The last two unifiers are more general than the first two, because the first two both have the same as and the same as but make more commitments to what these values are.
Substitution is a most general unifier (MGU) of expressions and if
is a unifier of the two expressions, and
if substitution is also a unifier of and , then must be an instance of for all expressions .
and are both most general unifiers of and . is a unifier, but not a most general unifier, of these. The resulting applications are
Note that is an instance of and an instance of , but these are not instances of . and are instances of each other.
The definition of MGU refers to “all expressions ” to preclude a substitution such as being a most general unifier of and , because it affects other expressions such as .
Expression is a renaming of if they differ only in the names of variables. In this case, they are both instances of each other.
If two expressions have a unifier, they have at least one MGU. The expressions resulting from applying the MGUs to the expressions are all renamings of each other. That is, if and are both MGUs of expressions and , then is a renaming of .
The propositional bottom-up proof procedure can be extended to Datalog by using ground instances of the clauses. A ground instance of a clause is obtained by uniformly substituting constants for the variables in the clause. The constants required are those appearing in the knowledge base or in the query. If there are no constants in the knowledge base or the query, one must be invented.
Suppose the knowledge base is
The set of all ground instances is
The propositional bottom-up proof procedure of Section 5.3.2 can be applied to the grounding to derive , , , , , and as the ground instances that are logical consequences.
Suppose the knowledge base is
The bottom-up proof procedure for query “” must invent a new constant symbol, say . The set of all ground instances is then
The propositional bottom-up proof procedure will derive and .
If the query were “” the set of ground instances would change to reflect the constants and .
The bottom-up proof procedure applied to the grounding of the knowledge base is sound, because each instance of each rule is true in every model. This procedure is essentially the same as the variable-free case, but it uses the set of ground instances of the clauses, all of which are true because the variables in a clause are universally quantified.
This bottom-up procedure will eventually halt for Datalog because there are only finitely many grounded atoms, and one ground atom is added to the consequence set each time through the loop.
This procedure is also complete for ground atoms. That is, if a ground atom is a consequence of the knowledge base, it will be derived. To prove this, as in the propositional case, construct a particular generic model. Recall that a model specifies the domain, what the constants denote, and what is true. A Herbrand interpretation is an interpretation where the domain is symbolic and consists of all constants of the language. A constant is invented if there are no constants in the knowledge base or the query. In a Herbrand interpretation, each constant denotes itself. Thus, in the definition of an interpretation, and are fixed for a given program, and all that needs to be specified is , which defines the predicate symbols.
Consider the Herbrand interpretation where the true atoms are the ground instances of the relations that are eventually derived by the bottom-up procedure. It is easy to see that this Herbrand interpretation is a model of the rules given. As in the variable-free case, it is a minimal model in that it has the fewest true atoms of any model. If for ground atom , then is true in the minimal model and, thus, is eventually derived.
A computer does not need to reason at the propositional level by grounding. Instead, it can reason in terms of variables instead of multiple ground instances. Sometimes, it needs to replace variables by other variables or by constants.
The problem of unification is the following: given two atoms or terms, determine whether they unify, and, if they do, return a unifier of them. The unification algorithm finds a most general unifier (MGU) of two atoms or returns if they do not unify.
The unification algorithm is given in Figure 15.4. is a set of equality statements implying and implied by the unification of and . is a substitution in normal form; if is in the substitution , then, by construction, is a variable that does not appear elsewhere in or in . In line 20, and must have the same predicate and the same number of arguments; otherwise the unification fails.
Consider the call . Initially is . The first time through the while loop, becomes . Suppose is selected next. Then becomes and becomes . Suppose is selected. Then is replaced by in and . becomes and becomes . Finally is selected, is replaced by , becomes , and becomes empty. The substitution is returned as an MGU.
Consider unifying with . starts off as . In the next step, becomes . Then is replaced by in , and becomes . Then is replaced by in , and becomes , and then is returned indicating that there is no unifier.
The top-down proof procedure for propositional definite clauses can be extended to handle variables by allowing instances of rules to be used in the derivation.
A generalized answer clause is of the form
where are terms and are atoms. The use of enables answer extraction: determining which instances of the query variables are a logical consequence of the knowledge base.
Initially, the generalized answer clause for query is
where are the variables that appear in the query . Intuitively this means that an instance of is true if the corresponding instance of the query is true.
The proof procedure maintains a current generalized answer clause.
At each stage, the algorithm selects an atom in the body of the generalized answer clause. It then chooses a clause in the knowledge base whose head unifies with the atom.
The SLD resolution where SLD stands for selection linear definite, of the generalized answer clause
on with the chosen clause
where and have most general unifier , is the answer clause
where the body of the chosen clause has replaced in the answer clause, and the MGU is applied to the whole answer clause.
An SLD derivation is a sequence of generalized answer clauses , , , such that
is the answer clause corresponding to the original query. If the query is , with free variables , the initial generalized answer clause is
is obtained by selecting an atom in the body of ; choosing a copy of a clause in the knowledge base whose head, , unifies with ; replacing with the body, ; and applying the unifier to the whole resulting answer clause.
The main difference between this and the propositional top-down proof procedure is that, for clauses with variables, the proof procedure must take copies of clauses from the knowledge base. The copying renames the variables in the clause with new names. This is both to remove name clashes between variables and because a single proof may use different instances of a clause.
is an answer. That is, it is of the form
When this occurs, the algorithm returns the answer
Notice how the answer is extracted; the arguments to keep track of the instances of the variables in the initial query that lead to a successful proof.
Figure 15.5 gives a non-deterministic algorithm that answers queries by searching for SLD derivations. This is non-deterministic in the sense that all derivations can be found by making appropriate choices that do not fail. If all choices fail, the algorithm fails, and there are no derivations. The “choose” on line 13 is implemented using search, as in Example 5.12. Recall that returns an MGU of and , if there is one, and if they do not unify. The algorithm for is given in Figure 15.4.
The singer Shakira was born in Barranquilla, the capital of Atlántico Department in Colombia. It follows that she was born in Colombia and so also in South America. Consider the following propositions (as part of a larger knowledge base):
select leftmost conjunct |
Note that this derivation used two instances of the rule
One instance eventually substituted for , and one instance substituted for .