
The third edition of Artificial Intelligence: foundations of computational agents, Cambridge University Press, 2023 is now available (including full text).

13.5 Function Symbols

Datalog requires a name, using a constant, for every individual about which the system reasons. Often it is simpler to identify an individual in terms of its components, rather than requiring a separate constant for each individual.

Example 13.24.

In many domains, you want to be able to refer to a time as an individual. You may want to say that some course is held at 11:30 a.m. You do not want a separate constant for each possible time, although this is possible. It is better to define times in terms of, say, the number of hours past midnight and the number of minutes past the hour. Similarly, you may want to reason with facts about particular dates. You cannot give a constant for each date, as there are infinitely many possible dates. It is easier to define a date in terms of the year, the month, and the day.

Using a constant to name each individual means that the knowledge base can only represent a finite number of individuals, and the number of individuals is fixed when the knowledge base is built. However, you may want to reason about a potentially infinite set of individuals.

Example 13.25.

Suppose you want to build a system that takes questions in English and answers them by consulting an online database. In this case, each sentence is an individual. You do not want to have to give each sentence its own name, because there are too many English sentences to name them all. It may be better to name the words and then to specify a sentence in terms of the sequence of words in the sentence. This approach may be more practical because there are far fewer words to name than sentences, and each word has its own natural name. You may also want to specify the words in terms of the letters in the word or in terms of their constituent parts.

Example 13.26.

You may want to reason about lists of students. For example, you may be required to derive the average mark of a class of students. A list of students is an individual that has properties, such as its length and its seventh element. Although it may be possible to name each list, it is very inconvenient to do so. It is much better to have a way to describe lists in terms of their elements.

Function symbols allow you to describe individuals indirectly. Rather than using a constant to describe an individual, an individual is described in terms of other individuals.

Syntactically, a function symbol is a word starting with a lower-case letter. We extend the definition of a term so that a term is either a variable, a constant, or of the form $f(t_{1},\ldots,t_{n})$, where $f$ is a function symbol and each $t_{i}$ is a term. Apart from extending the definition of terms, the language stays the same.

Terms only appear within predicate symbols. You do not write clauses that imply terms. You may, however, write clauses that include atoms that use function symbols.

The semantics of Datalog must be expanded to reflect the new syntax. The definition of $\phi$ is extended so that $\phi$ also assigns to each $n$-ary function symbol a function from $D^{n}$ into $D$. A constant can be seen as a 0-ary function symbol (i.e., one with no arguments). Thus, $\phi$ specifies which individual is denoted by each ground term.
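The extended definition of terms and of $\phi$ can be sketched in Python. This is only an illustration under stated assumptions: the class `Term`, the function `denote`, and the particular interpretation `phi` (with the natural numbers as the domain $D$ and the symbols `zero`, `s`, and `plus`) are hypothetical names chosen for the example, not part of the text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Term:
    """A ground term: a function symbol applied to zero or more ground terms.

    A constant is treated as a 0-ary function symbol (no arguments).
    """
    symbol: str
    args: Tuple["Term", ...] = ()


def denote(term: Term, phi: Dict[str, Callable]) -> object:
    """Return the individual in D that the interpretation phi assigns to term."""
    return phi[term.symbol](*(denote(arg, phi) for arg in term.args))


# Assumed interpretation for illustration: D is the natural numbers.
phi = {
    "zero": lambda: 0,            # 0-ary function symbol (a constant)
    "s": lambda n: n + 1,         # unary function symbol, a function from D into D
    "plus": lambda m, n: m + n,   # binary function symbol, a function from D^2 into D
}

two = Term("s", (Term("s", (Term("zero"),)),))
three = Term("s", (two,))
five = denote(Term("plus", (two, three)), phi)
```

A different `phi` over a different domain would give the same ground terms different denotations; the syntax of the terms does not change.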

Example 13.27.

Suppose you want to define dates, such as 20 July 1969, the date on which a human first walked on the moon. You can use the function symbol $ce$ (common era) so that $ce(Y,M,D)$ denotes a date with year $Y$, month $M$, and day $D$. For example, $ce(1969,jul,20)$ may denote 20 July 1969. Similarly, you can define the symbol $bce$ to denote dates before the common era.

The only way to use the function symbol is to write clauses that define relations using the function symbol. There is no notion of defining the $ce$ function; dates are not in a computer any more than people are.

To use function symbols, you can write clauses that are quantified over the arguments of the function symbol. For example, Figure 13.6 defines the $before(D_{1},D_{2})$ relation that is true if date $D_{1}$ is before date $D_{2}$.

% $before(D_{1},D_{2})$ is true if date $D_{1}$ is before date $D_{2}$

	$before(ce(Y1,M1,D1),ce(Y2,M2,D2))\leftarrow$
	$\ \ \ \ Y1<Y2.$
	$before(ce(Y,M1,D1),ce(Y,M2,D2))\leftarrow$
	$\ \ \ \ month(M1,N1)\wedge$
	$\ \ \ \ month(M2,N2)\wedge$
	$\ \ \ \ N1<N2.$
	$before(ce(Y,M,D1),ce(Y,M,D2))\leftarrow$
	$\ \ \ \ D1<D2.$

% $month(M,N)$ is true if month $M$ is the $N$ th month of the year.

	$month(jan,1).$
	$month(feb,2).$
	$month(mar,3).$
	$month(apr,4).$
	$month(may,5).$
	$month(jun,6).$
	$month(jul,7).$
	$month(aug,8).$
	$month(sep,9).$
	$month(oct,10).$
	$month(nov,11).$
	$month(dec,12).$

Figure 13.6: Axiomatizing a “before” relation for dates in the common era

This assumes the predicate “$<$” represents the relation “less than” between integers. This could be represented in terms of clauses, but is often predefined, as it is in Prolog. The months are represented by constants that consist of the first three letters of the month.
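The three clauses of Figure 13.6 can be mirrored as a Python check, one branch per clause. This is a sketch, not a logic program: the names `Ce`, `MONTH`, and `before` are assumptions made for the example, and unlike the relation in the figure, the function only tests two given dates and cannot enumerate dates that satisfy the relation.

```python
from typing import NamedTuple

# month(M, N): month M is the N-th month of the year.
MONTH = {
    "jan": 1, "feb": 2, "mar": 3, "apr": 4, "may": 5, "jun": 6,
    "jul": 7, "aug": 8, "sep": 9, "oct": 10, "nov": 11, "dec": 12,
}


class Ce(NamedTuple):
    """Plays the role of the term ce(Y, M, D): a date in the common era."""
    year: int
    month: str
    day: int


def before(d1: Ce, d2: Ce) -> bool:
    """True if date d1 is before date d2."""
    # Clause 1: an earlier year.
    if d1.year < d2.year:
        return True
    # Clause 2: the same year and an earlier month.
    if d1.year == d2.year and MONTH[d1.month] < MONTH[d2.month]:
        return True
    # Clause 3: the same year and month, and an earlier day.
    if d1.year == d2.year and d1.month == d2.month and d1.day < d2.day:
        return True
    return False


moon_landing = Ce(1969, "jul", 20)
next_day = Ce(1969, "jul", 21)
is_before = before(moon_landing, next_day)
```

As in the figure, the repeated variable $Y$ in the second and third clauses corresponds to the explicit equality tests on the year (and month) here.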

A knowledge base consisting of clauses with function symbols can compute any computable function. Thus, a knowledge base can be interpreted as a program, called a logic program. Logic programs are Turing complete; they can compute any function computable on a digital computer.

This expansion of the language has a major impact. With just one function symbol and one constant, the language contains infinitely many ground terms and infinitely many ground atoms. The infinite number of terms can be used to describe an infinite number of individuals.
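To see why one function symbol and one constant already give infinitely many ground terms, the following sketch enumerates them lazily. The constant `zero` and the unary function symbol `s` are assumed names for the illustration; any other pair of symbols would do.

```python
from itertools import islice
from typing import Iterator


def ground_terms(constant: str = "zero", function: str = "s") -> Iterator[str]:
    """Yield constant, function(constant), function(function(constant)), ... forever."""
    term = constant
    while True:
        yield term
        term = f"{function}({term})"


# The generator never ends, so take only a finite prefix.
first_four = list(islice(ground_terms(), 4))
```

Each term produced is syntactically distinct from all earlier ones, so the enumeration never repeats.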

Function symbols are used to build data structures, as in the following example.

Example 13.28.

A tree is a useful data structure. You could use a tree to build a syntactic representation of a sentence for a natural language processing system. You could decide that a labeled tree is either of the form $node(N,LT,RT)$ or of the form $leaf(L)$. Thus, $node$ is a function from a name, a left tree, and a right tree into a tree. The function symbol $leaf$ denotes a function from the label of a leaf node into a tree.

The relation $at\_leaf(L,T)$ is true if label $L$ is the label of a leaf in tree $T$. It can be defined by

	$at\_leaf(L,leaf(L)).$
	$at\_leaf(L,node(N,LT,RT))\leftarrow$
	$\ \ \ \ at\_leaf(L,LT).$
	$at\_leaf(L,node(N,LT,RT))\leftarrow$
	$\ \ \ \ at\_leaf(L,RT).$

This is an example of a structural recursive program. The rules cover all of the cases for each of the structures representing trees.

The relation $in\_tree(L,T)$, which is true if label $L$ is the label of an interior node of tree $T$, can be defined by

	$in\_tree(L,node(L,LT,RT)).$
	$in\_tree(L,node(N,LT,RT))\leftarrow$
	$\ \ \ \ in\_tree(L,LT).$
	$in\_tree(L,node(N,LT,RT))\leftarrow$
	$\ \ \ \ in\_tree(L,RT).$
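The same structural recursion can be sketched in Python, with one case for each tree structure. The tuple encoding and the helper names `leaf`, `node`, and the example tree are assumptions made for illustration. The functions only test membership for a given label, whereas the relations above can also be queried with $L$ unbound.

```python
def leaf(label):
    """Build the term leaf(L)."""
    return ("leaf", label)


def node(name, left, right):
    """Build the term node(N, LT, RT)."""
    return ("node", name, left, right)


def at_leaf(label, tree) -> bool:
    """True if label is the label of a leaf in tree."""
    if tree[0] == "leaf":
        return tree[1] == label
    _, _name, left, right = tree
    return at_leaf(label, left) or at_leaf(label, right)


def in_tree(label, tree) -> bool:
    """True if label is the label of an interior node of tree."""
    if tree[0] == "leaf":
        return False  # no clause of in_tree matches a leaf
    _, name, left, right = tree
    return name == label or in_tree(label, left) or in_tree(label, right)


# A small, hypothetical parse tree for the sentence "the cat sleeps".
sentence = node("s", node("np", leaf("the"), leaf("cat")), leaf("sleeps"))
```

For this tree, `at_leaf("cat", sentence)` and `in_tree("np", sentence)` hold, while `at_leaf("np", sentence)` and `in_tree("cat", sentence)` do not.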

Example 13.29.

A list is an ordered sequence of elements. You can reason about lists using just function symbols and constants, without the notion of a list being predefined in the language. A list is either the empty list or an element followed by a list. You can invent a constant to denote the empty list. Suppose you use the constant $nil$ to denote the empty list. You can choose a function symbol, say $cons(Hd,Tl)$, with the intended interpretation that it denotes a list with first element $Hd$ and rest of the list $Tl$. The list containing the elements $a$, $b$, $c$ would then be represented as

	$cons(a,cons(b,cons(c,nil))).$

To use lists, one must write predicates that do something with them. For example, the relation $append(X,Y,Z)$ that is true when $X$, $Y$, and $Z$ are lists, such that $Z$ contains the elements of $X$ followed by the elements of $Y$, can be defined recursively by

	$append(nil,L,L).$
	$append(cons(Hd,X),Y,cons(Hd,Z))\leftarrow$
	$\ \ \ \ append(X,Y,Z).$

There is nothing special about $cons$ or $nil$; we could have just as well used $foo$ and $bar$.
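A minimal Python sketch of this list encoding follows. The tuple representation and the names `NIL`, `cons`, and `append` are assumptions for illustration. Note one difference from the relation above: this function computes $Z$ from given $X$ and $Y$ only, whereas the relation $append(X,Y,Z)$ can in principle also be queried with $Z$ given and $X$, $Y$ unknown.

```python
NIL = "nil"  # constant denoting the empty list


def cons(hd, tl):
    """Build the term cons(Hd, Tl): a list with first element hd and rest tl."""
    return ("cons", hd, tl)


def append(x, y):
    """Return the list z containing the elements of x followed by the elements of y."""
    # append(nil, L, L).
    if x == NIL:
        return y
    # append(cons(Hd, X), Y, cons(Hd, Z)) <- append(X, Y, Z).
    _, hd, tl = x
    return cons(hd, append(tl, y))


ab = cons("a", cons("b", NIL))
c = cons("c", NIL)
abc = append(ab, c)  # the term cons(a, cons(b, cons(c, nil)))
```

Renaming `"cons"` and `"nil"` to any other strings, consistently, would leave the behavior of `append` unchanged, which is the point made above about $foo$ and $bar$.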

First-Order and Second-Order Logic

First-order predicate calculus is a logic that extends propositional calculus to include atoms with function symbols and logical variables. All logical variables must have explicit quantification in terms of “for all” ( $\forall$ ) and “there exists” ( $\exists$ ). The semantics of first-order predicate calculus is like the semantics of logic programs presented in this chapter, but with a richer set of operators.

The language of logic programs forms a pragmatic subset of first-order predicate calculus, which has been developed because it is useful for many tasks. First-order predicate calculus can be seen as a language that adds disjunction and explicit quantification to logic programs.

First-order logic is first order because it allows quantification over individuals in the domain. First-order logic allows neither predicates as variables nor quantification over predicates.

Second-order logic allows for quantification over first-order relations and predicates whose arguments are first-order relations. These are second-order relations. For example, the second-order logic formula

$$\forall R\ symmetric(R)\iff(\forall X\forall Y\ R(X,Y)\rightarrow R(Y,X))$$

defines the second-order relation $symmetric$, which is true if its argument is a symmetric relation.
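For a relation over a finite domain, the idea of a predicate whose argument is itself a relation can be illustrated in Python by passing the relation as a value. This is only an analogy under stated assumptions: the relation is represented extensionally as a finite set of pairs, and the names `symmetric`, `sibling`, and `parent` are hypothetical.

```python
from typing import Set, Tuple


def symmetric(relation: Set[Tuple[object, object]]) -> bool:
    """True if R(X, Y) implies R(Y, X) for every pair in the relation."""
    return all((y, x) in relation for (x, y) in relation)


# Two example first-order relations, given as sets of pairs.
sibling = {("ann", "bob"), ("bob", "ann")}
parent = {("ann", "cal")}

sibling_is_symmetric = symmetric(sibling)
parent_is_symmetric = symmetric(parent)
```

Here `symmetric` takes a first-order relation as its argument, which is what makes it second order.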

Second-order logic seems necessary for many applications because transitive closure is not first-order definable. For example, suppose you want $before$ to be the transitive closure of $next$, where $next(X,s(X))$ is true. Think of $next$ meaning the “next millisecond” and $before$ denoting “before.” The natural first-order definition would be

$$\forall X\forall Y\ before(X,Y)\iff\left(Y=s(X)\vee before(s(X),Y)\right). \tag{13.1}$$

This expression does not accurately capture the definition, because, for example,

$$\forall X\forall Y\ before(X,Y)\rightarrow\exists W\ Y=s(W)$$

does not logically follow from Formula (13.1): there are nonstandard models of Formula (13.1) with $Y$ denoting infinity. To capture the transitive closure, you require a formula stating that $before$ is the minimal predicate that satisfies the definition. This can be stated using second-order logic.
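On a finite domain, the "minimal predicate that satisfies the definition" can be computed directly as a least fixed point. The sketch below is only an illustration of minimality: it assumes $next$ is given as a finite set of pairs $(X, s(X))$, and the name `least_before` is hypothetical. It does not address the infinite, nonstandard models that make the first-order formula inadequate.

```python
from typing import Set, Tuple

Pair = Tuple[int, int]


def least_before(next_pairs: Set[Pair]) -> Set[Pair]:
    """Smallest relation satisfying: before(X, Y) if Y = s(X) or before(s(X), Y)."""
    before: Set[Pair] = set()
    while True:
        # Apply the right-hand side of the definition once to the current relation.
        new = set(next_pairs) | {
            (x, y)
            for (x, sx) in next_pairs
            for (z, y) in before
            if z == sx
        }
        if new == before:  # nothing more can be derived: least fixed point reached
            return before
        before = new


next_pairs = {(0, 1), (1, 2), (2, 3)}
closure = least_before(next_pairs)
```

Starting from the empty relation and adding only what the definition forces is what makes the result minimal; larger relations may also satisfy the definition when read as a first-order formula.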

First-order logic is semi-decidable: a sound and complete proof procedure exists that can prove every statement that logically follows from the axioms, but the procedure may not halt when the statement does not follow. Second-order logic is undecidable; no sound and complete proof procedure can be implemented on a Turing machine.


Artificial Intelligence 2E