The learning methods of Chapters 7 and 8 and the probabilistic reasoning of Chapter 9 were expressed in terms of features and random variables. Entities, properties, and relations are not themselves features or random variables, but they can be used to construct random variables.
Unfortunately, the name “variable” is used for both random variables and logical variables, but the two are unrelated: a logical variable denotes an entity, whereas a random variable is a function on worlds or states. This chapter always distinguishes the two; in other sources, you need to determine which is meant from the context.
The random variables from a knowledge graph (Section 16.1) are defined as follows:
There is a random variable for each entity–property pair for functional properties and a random variable for each entity–relation pair for functional relations. Recall that $p$ is functional if there is a unique object for each subject; i.e., if $(x,p,{y}_{1})$ and $(x,p,{y}_{2})$ are both triples, then ${y}_{1}={y}_{2}$. The range of the property is the domain of the random variable. For example, the height in centimeters (at age 20) of Christine Sinclair (Q262802 in Wikidata) is a real-valued random variable if the property height is functional. Month-of-birth is a categorical random variable for each person. Birth-mother of a particular person is a random variable with people as the domain.
For a functional relation (the object of each triple is an entity), such as birth-mother, a prediction is a probability distribution over entities. Such a probability distribution cannot be defined independently of the population; instead, either the probability over entities needs to be learned for the particular population, or a function of the embeddings of the entities and relations needs to be defined.
For non-functional properties and relations, there is a Boolean random variable for each subject–property–value or subject–relation–object triple. For example, participated-in is a non-functional relation, and there is a Boolean random variable for triples such as Q262802 participated-in Q181278 (whether Christine Sinclair participated in the 2020 Summer Olympics).
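The construction above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the triples, entity names, and the helper functions `is_functional` and `random_variables` are hypothetical, and functionality is inferred here from the observed triples, whereas in practice it would normally be declared as part of the ontology.

```python
# Hypothetical toy knowledge graph: (subject, property, value) triples.
triples = [
    ("alice", "birth_mother", "carol"),
    ("bob", "birth_mother", "carol"),
    ("alice", "participated_in", "event1"),
    ("alice", "participated_in", "event2"),
    ("bob", "participated_in", "event1"),
]

def is_functional(triples, prop):
    """True if no subject has two different values for prop in these triples.

    Assumption: functionality is read off the data; a real system would
    usually take it from the schema instead.
    """
    seen = {}
    for s, p, o in triples:
        if p != prop:
            continue
        if s in seen and seen[s] != o:
            return False
        seen[s] = o
    return True

def random_variables(triples):
    """Map each random variable to its domain (the list of values it can take)."""
    rvs = {}
    for p in sorted({p for _, p, _ in triples}):
        subjects = sorted({s for s, q, _ in triples if q == p})
        values = sorted({o for _, q, o in triples if q == p})
        if is_functional(triples, p):
            # One variable per entity-property pair; its domain is the range of p.
            for s in subjects:
                rvs[(s, p)] = values
        else:
            # One Boolean variable per subject-property-value triple.
            for s in subjects:
                for o in values:
                    rvs[(s, p, o)] = [False, True]
    return rvs

rvs = random_variables(triples)
for name, domain in rvs.items():
    print(name, domain)
```

In this toy graph, birth_mother yields one variable per person, while participated_in yields one Boolean variable for every person–event combination, including combinations that do not appear as triples.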
For more general relationships $r({X}_{1},\mathrm{\dots},{X}_{k})$:
If one argument, say ${X}_{k}$, is a function of the other arguments, there is a random variable for each tuple $r({e}_{1},\mathrm{\dots},{e}_{k-1})$ where the domain of the random variable is the set of values that ${X}_{k}$ can take. For example, the relation $rated(U,M,R)$, which means that $R$ was the rating (from 1 to 5) given by user $U$ to movie $M$, gives a random variable for each user–movie pair with domain the set of possible ratings, namely $\{1,2,3,4,5\}$. Predicting the rating for a particular user and movie can be seen as a regression task, where the prediction is a real number, or a classification task, where the prediction is a probability distribution over the numbers from 1 to 5.
Otherwise, there is a Boolean random variable for each tuple $r({e}_{1},\mathrm{\dots},{e}_{k})$.
In the description below, the functional case is treated as a relation of $k-1$ arguments, with a non-Boolean prediction.
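As a rough illustration of the $rated(U,M,R)$ example, the following sketch builds one random variable per user–movie pair and contrasts the two kinds of prediction. The user and movie names and the probabilities are made up for illustration; using the expected value as the regression-style prediction is one reasonable choice, not the only one.

```python
from itertools import product

# Hypothetical users and movies; the rating scale is the one from the text.
users = ["u1", "u2"]
movies = ["m1", "m2", "m3"]
ratings = [1, 2, 3, 4, 5]

# rated(U, M, R) with R a function of (U, M):
# one random variable per user-movie pair, each with domain {1, ..., 5}.
rating_vars = {(u, m): ratings for u, m in product(users, movies)}

# Classification-style prediction for one pair: a probability
# distribution over the ratings (illustrative numbers only).
dist = {1: 0.05, 2: 0.10, 3: 0.20, 4: 0.40, 5: 0.25}
assert abs(sum(dist.values()) - 1.0) < 1e-9

# Regression-style prediction: a single real number, here the expected rating.
expected_rating = sum(r * p for r, p in dist.items())

# A point prediction from the distribution: the most likely rating.
most_likely = max(dist, key=dist.get)

print(len(rating_vars), "random variables")
print("expected rating:", round(expected_rating, 2))
print("most likely rating:", most_likely)
```

Treating the rating as a function of the user and movie keeps the number of random variables at the number of user–movie pairs, rather than one Boolean variable for each user–movie–rating tuple.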
A relation of $k$ arguments gives ${n}^{k}$ random variables, where $n$ is the number of entities. There may be fewer, depending on the domains of the arguments; for example, if the first argument ranges over users and the second over movies, the number of random variables is the number of users times the number of movies. If arbitrary interdependence among the random variables is allowed, there are ${2}^{{n}^{k}}-1$ probabilities to be assigned to specify the joint distribution, just for a single Boolean relation with $k$ arguments. How to avoid this combinatorial explosion of the number of random variables is the subject of lifted inference.
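To get a feel for these counts, the arithmetic can be checked directly. The function names and the population sizes below are hypothetical, chosen only to show how quickly the numbers grow.

```python
import math

def num_random_variables(domain_sizes):
    """Number of ground tuples: the product of the argument domain sizes."""
    return math.prod(domain_sizes)

def num_free_probabilities(num_boolean_vars):
    """Free parameters of an unrestricted joint distribution over Boolean variables."""
    return 2 ** num_boolean_vars - 1

# A binary relation (k = 2) over n = 3 entities.
n, k = 3, 2
num_vars = num_random_variables([n] * k)   # n^k = 9
num_probs = num_free_probabilities(num_vars)  # 2^9 - 1 = 511
print(num_vars, num_probs)

# With typed arguments, e.g. 100 users and 50 movies (illustrative sizes),
# there are 100 * 50 random variables rather than 150^2.
typed_vars = num_random_variables([100, 50])
print(typed_vars)

# The unrestricted joint over those variables is far too large to tabulate:
# count the decimal digits of 2^5000 - 1 instead of printing it.
print(len(str(num_free_probabilities(typed_vars))), "digits")
```

Even three entities and one binary relation already require 511 numbers for an unrestricted joint distribution, which is why some form of parameter sharing across individuals is needed.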