# A.4 Relations and the Relational Algebra

Relations are common in AI and database systems. The relational algebra defines operations on relations and is the basis of relational databases.

A scope $S$ is a set of variables. A tuple $t$ on scope $S$ has a value on each variable in its scope. A variable can be seen as a function on tuples; one that returns the value for that variable for that tuple. We write $X(t)$ to be the value of tuple $t$ on variable $X$. The value of ${X}({t})$ must be in $dom(X)$. This is like the mathematical notion of tuple, except the index is given by a variable, not by an integer.

A relation is a set of tuples, all with the same scope. A relation is often given a name. The scope of the tuples is often called the relation scheme. A relational database is a set of relations. A scheme of a relational database is the set of pairs of relation names and relation schemes.

A relation with scope $X_{1},\dots,X_{n}$ can be seen as a Boolean factor on $X_{1},\dots,X_{n}$, where the true elements are represented as tuples.

Often a relation is written as a table.

###### Example A.2.

The following is a tabular depiction of a relation, $enrolled$:

$Course$ $Year$ $Student$ $Grade$
cs322 2008 fran 77
cs111 2009 billie 88
cs111 2009 jess 78
cs444 2008 fran 83
cs322 2009 jordan 92

The heading gives the scheme, namely $\{Course,Year,Student,Grade\}$, and every other row is a tuple. The first tuple, call it $t_{1}$, is defined by ${Course}({t_{1}})=\mbox{cs322}$, ${Year}({t_{1}})=2008$, ${Student}({t_{1}})=\mbox{fran}$, ${Grade}({t_{1}})=77$.

The order of the columns and the order of the rows is not significant.

If $r$ is a relation with scheme $S$, and $c$ is a condition on the variables in $S$, the selection of $c$ in $r$, written $\sigma_{c}(r)$, is the set of tuples in $r$ for which $c$ holds. The selection has the same scheme as $r$.

If $r$ is a relation with scheme $S$, and $S_{0}\subseteq S$, the projection of $r$ onto $S_{0}$, written $\pi_{S_{0}}(r)$, is the set of tuples of $r$ where the scope is restricted to $S_{0}$.

###### Example A.3.

Suppose $enrolled$ is the relation given in Example A.2.

The relation $\sigma_{Grade>79}(enrolled)$ selects those tuples in $enrolled$ where the grade is over 79. This is the relation:

$Course$ $Year$ $Student$ $Grade$
cs111 2009 billie 88
cs444 2008 fran 83
cs322 2009 jordan 92

The relation $\pi_{\{Student,Year\}}(enrolled)$ specifies what years students were enrolled:

$Student$ $Year$
fran 2008
billie 2009
jess 2009
jordan 2009

Notice how the first and the fourth tuple of $enrolled$ become the same tuple in the projection; they represent the same function on $\{Student,Year\}$.

If two relations are on the same scheme, the union, intersection, and set difference of these are defined as the corresponding operations on the set of tuples.

If $r_{1}$ and $r_{2}$ are two relations, the natural join of $r_{1}$ and $r_{2}$, written $r_{1}\bowtie r_{2}$, is a relation where

• the scheme of the join is the union of the scheme of $r_{1}$ and the scheme of $r_{2}$

• a tuple is in the join if the tuple restricted to the scope of $r_{1}$ is in the relation $r_{1}$ and the tuple restricted to the scope of $r_{2}$ is in the relation $r_{2}$.

###### Example A.4.

Consider the relation $assisted$:

$Course$ $Year$ $TA$
cs322 2008 yuki
cs111 2009 sam
cs111 2009 chris
cs322 2009 yuki

The join of $enrolled$ and $assisted$, written $enrolled\bowtie assisted$, is the relation

$Course$ $Year$ $Student$ $Grade$ $TA$
cs322 2008 fran 77 yuki
cs111 2009 billie 88 sam
cs111 2009 jess 78 sam
cs111 2009 billie 88 chris
cs111 2009 jess 78 chris
cs322 2009 jordan 92 yuki

Note how in the join, the information about $cs444$ was lost, as there was no TA in that course.