13.3 Datalog: A Relational Rule Language

This section expands the syntax for the propositional definite clause language. The syntax is based on normal mathematical notation for predicate symbols but follows Prolog’s convention for variables.

The syntax of Datalog is given by the following, where a word is a sequence of letters, digits, or an underscore (“$\_$”):

• A logical variable is a word starting with an upper-case letter or with “_”. For example $X$, $Room$, $B4$, $Raths$, and $The\_big\_guy$ are all variables.

• A constant is a word that starts with a lower-case letter, or is a number constant or a string.

• A predicate symbol is a word that starts with a lower-case letter. Constants and predicate symbols are distinguishable by their context in a knowledge base.

For example, $kim$, $r123$, $f$, $grandfather$, and $borogroves$ can be constants or predicate symbols, depending on the context; $725$ is a constant.

• A term is either a variable or a constant.

For example $X$, $kim$, $cs422$, $mome$, or $Raths$ can be terms.

• An atomic symbol, or simply an atom, is of the form $p$ or $p(t_{1},\ldots,t_{n})$, where $p$ is a predicate symbol and each $t_{i}$ is a term. Each $t_{i}$ is called an argument to the predicate.

For example, $teaches(sue,cs422)$, $in(kim,r123)$, $father(bill,Y)$, $happy(C)$, $outgrabe(mome,Raths)$, and $sunny$ can all be atoms. From context in the atom $outgrabe(mome,Raths)$, the symbol $outgrabe$ is a predicate symbol and $mome$ is a constant.

The notions of definite clause, rule, query, and knowledge base are the same as for propositional definite clauses but with the expanded definition of atom. The definitions are repeated here.

• A definite clause is of the form

 $h\leftarrow\mbox{}a_{1}\wedge\mbox{}\ldots\wedge\mbox{}a_{m}.$

where $h$ is an atom, the head of the clause, and each $a_{i}$ is an atom. It can be read “$h$ if $a_{1}$ and …and $a_{m}$”.

If $m>0$, the clause is called a rule. $a_{1}\wedge\mbox{}\ldots\wedge\mbox{}a_{m}$ is the body of the clause.

If $m=0$ the arrow can be omitted and the clause is called atomic clause or fact. An atomic clause has an empty body.

• A knowledge base is a set of definite clauses.

• A query is of the form

 $\mbox{{ask}~{}}~{}a_{1}\wedge\mbox{}\ldots\wedge\mbox{}a_{m}.$
• An expression is either a term, an atom, a definite clause, or a query.

In our examples, we will follow the Prolog convention that comments, which are ignored by the system, extend from a “%” to the end of the line.

Example 13.3.

The following is a knowledge base:

 $\displaystyle{grandfather(sam,X)\leftarrow\mbox{}father(sam,Y)\wedge\mbox{}% parent(Y,X).}$ $\displaystyle{in(kim,R)\leftarrow\mbox{}teaches(kim,cs422)\wedge\mbox{}in(cs42% 2,R).}$ $\displaystyle{slithy(toves)\leftarrow\mbox{}mimsy\wedge\mbox{}borogroves\wedge% \mbox{}outgrabe(mome,Raths).}$

From context, $sam$, $kim$, $cs422$, $toves$, and $mome$ are constants; $grandfather$, $father$, $parent$, $in$, $teaches$, $slithy$, $mimsy$, $borogroves$, and $outgrabe$ are predicate symbols; and $X$, $Y$, $R$ and $Raths$ are variables.

The first two clauses about Kim and Sam may make some intuitive sense, even though we have not explicitly provided any formal specification for the meaning of sentences of the definite clause language. However, regardless of the mnemonic names’ suggestiveness, as far as the computer is concerned, the first two clauses have no more meaning than the third. Meaning is provided only by virtue of a semantics.

An expression is ground if it does not contain any variables. For example, $teaches(chris,cs322)$ is ground, but $teaches(Prof,Course)$ is not ground.

The next section defines the semantics. We first consider ground expressions and then extend the semantics to include variables.