foundations of computational agents
The axioms of probability are very weak and provide few constraints on allowable conditional probabilities. For example, if there are n binary variables, there are 2^n - 1 free parameters, which means there are 2^n - 1 numbers to be assigned to give an arbitrary probability distribution.
A useful way to limit the amount of information required is to assume that each variable directly depends on only a few other variables. This uses assumptions of conditional independence. Not only does this reduce the number of probabilities required to specify a model, but the independence structure can also be exploited for efficient reasoning.
As long as the value of P(h) is not 0 or 1, the value of P(h) does not constrain the value of P(h | e). This latter probability could have any value in the range [0, 1]. It is 1 when e implies h, and it is 0 if e implies ¬h. A common kind of qualitative knowledge is of the form P(h | e1 ∧ e2) = P(h | e1), which specifies that e2 is irrelevant to the probability of h given that e1 is observed. This idea applies to random variables, as in the following definition.
Random variable X is conditionally independent of random variable Y given a set of random variables Zs if

P(X | Y, Zs) = P(X | Zs)

whenever the probabilities are well defined. That is, given a value of each variable in Zs, knowing Y's value does not affect the belief in the value of X.
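As a minimal numerical sketch of this definition (the toy distribution and binary variables here are assumed for illustration, not from the text), a joint built as P(z) P(x|z) P(y|z) makes X conditionally independent of Y given Z, and that can be checked directly:

```python
from itertools import product

# Toy joint over binary X, Y, Z, constructed so that X is conditionally
# independent of Y given Z: P(x, y, z) = P(z) P(x|z) P(y|z).
p_z = {0: 0.6, 1: 0.4}
p_x_given_z = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}  # indexed [z][x]
p_y_given_z = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.9, 1: 0.1}}  # indexed [z][y]

joint = {(x, y, z): p_z[z] * p_x_given_z[z][x] * p_y_given_z[z][y]
         for x, y, z in product([0, 1], repeat=3)}

def cond(x, given):
    """P(X=x | given), where `given` maps 'y'/'z' to observed values."""
    idx = {'y': 1, 'z': 2}
    den = sum(p for w, p in joint.items()
              if all(w[idx[k]] == v for k, v in given.items()))
    num = sum(p for w, p in joint.items()
              if w[0] == x and all(w[idx[k]] == v for k, v in given.items()))
    return num / den

# P(X=x | Y=y, Z=z) equals P(X=x | Z=z) for every well-defined case.
for x, y, z in product([0, 1], repeat=3):
    assert abs(cond(x, {'y': y, 'z': z}) - cond(x, {'z': z})) < 1e-12
```

Observing Y changes nothing once Z is known, which is exactly the defining equation above.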
Consider a probabilistic model of students and exams. It is reasonable to assume that the random variable Intelligent is independent of Works_hard, given no observations. If you find that a student works hard, it does not tell you anything about their intelligence.

The answers to the exam (the variable Answers) would depend on whether the student is intelligent and works hard. Thus, given Answers, Intelligent would be dependent on Works_hard: if you found someone had insightful answers and did not work hard, your belief that they are intelligent would go up.

The grade on the exam (variable Grade) should depend on the student's answers, not directly on the intelligence or whether the student worked hard. Thus, Grade would be independent of Intelligent given Answers. However, if the answers were not observed, Intelligent will affect Grade (because highly intelligent students would be expected to have different answers than not-so-intelligent students); thus, Grade is dependent on Intelligent given no observations.
The following four statements are equivalent, as long as the conditional probabilities are well defined:
X is conditionally independent of Y given Zs.
Y is conditionally independent of X given Zs.
P(X=x | Y=y ∧ Zs=zs) = P(X=x | Y=y′ ∧ Zs=zs) for all values x, y, y′, and zs. That is, in the context that you are given a value for Zs, changing the value of Y does not affect the belief in X.
P(X, Y | Zs) = P(X | Zs) P(Y | Zs).
The proof is left as an exercise. See Exercise 9.1.
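While the proof is an exercise, the factorization in the last statement can at least be sanity-checked numerically. In this sketch (toy numbers assumed), the joint is built so that the first statement holds by construction, and the factorization P(X, Y | Z) = P(X | Z) P(Y | Z) is verified for every value combination:

```python
from itertools import product

# Joint built as P(z) P(x|z) P(y|z): X is conditionally independent of
# Y given Z by construction.
p_z = {0: 0.3, 1: 0.7}
p_x_given_z = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.4, 1: 0.6}}  # indexed [z][x]
p_y_given_z = {0: {0: 0.1, 1: 0.9}, 1: {0: 0.5, 1: 0.5}}  # indexed [z][y]

joint = {(x, y, z): p_z[z] * p_x_given_z[z][x] * p_y_given_z[z][y]
         for x, y, z in product([0, 1], repeat=3)}

def p(pred):
    """Sum the probability of all worlds (x, y, z) satisfying pred."""
    return sum(pr for w, pr in joint.items() if pred(*w))

# Check the factorization P(X, Y | Z) = P(X | Z) P(Y | Z).
for x, y, z in product([0, 1], repeat=3):
    p_xy_z = p(lambda a, b, c: a == x and b == y and c == z) / p_z[z]
    p_x_z = p(lambda a, b, c: a == x and c == z) / p_z[z]
    p_y_z = p(lambda a, b, c: b == y and c == z) / p_z[z]
    assert abs(p_xy_z - p_x_z * p_y_z) < 1e-12
```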
Variables X and Y are unconditionally independent if P(X, Y) = P(X) P(Y), that is, if they are conditionally independent given no observations. Note that X and Y being unconditionally independent does not imply they are conditionally independent given some other information Z.
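A standard counterexample makes this concrete (the construction is a common textbook device, assumed here for illustration): let X and Y be independent fair coins and Z their exclusive-or. X and Y are unconditionally independent, yet once Z is observed each determines the other:

```python
from itertools import product

# X and Y are independent fair coins; Z = X XOR Y.
joint = {}
for x, y in product([0, 1], repeat=2):
    joint[(x, y, x ^ y)] = 0.25

def prob(pred):
    """Sum the probability of all worlds (x, y, z) satisfying pred."""
    return sum(p for w, p in joint.items() if pred(*w))

# Unconditionally independent: P(X=1, Y=1) = P(X=1) P(Y=1).
assert prob(lambda x, y, z: x == 1 and y == 1) == \
       prob(lambda x, y, z: x == 1) * prob(lambda x, y, z: y == 1)

# But given Z=0, knowing Y pins down X completely:
p_x_yz = prob(lambda x, y, z: x == 1 and y == 1 and z == 0) / \
         prob(lambda x, y, z: y == 1 and z == 0)
p_x_z = prob(lambda x, y, z: x == 1 and z == 0) / \
        prob(lambda x, y, z: z == 0)
print(p_x_yz, p_x_z)  # 1.0 0.5
```

So P(X=1 | Y=1, Z=0) = 1 while P(X=1 | Z=0) = 0.5: independence is lost when Z is given.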
Conditional independence is a useful assumption that is often natural to assess and can be exploited in inference. It is rare to have a table of probabilities of worlds and assess independence numerically.
Another useful concept is context-specific independence. Variables X and Y are independent with respect to context Z = v if

P(X | Y, Z = v) = P(X | Z = v)

whenever the probabilities are well defined. That is, for all x ∈ domain(X) and for all y ∈ domain(Y), if P(Y = y ∧ Z = v) > 0:

P(X = x | Y = y ∧ Z = v) = P(X = x | Z = v).
This is like conditional independence, but holds only for one of the values of Z. This is discussed in more detail when representing conditional probabilities in terms of decision trees.
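A small sketch can show the distinction (the numbers and binary variables are assumed, not from the text): X and Y are independent in the context Z = 0, but dependent in the context Z = 1, so they are not conditionally independent given Z:

```python
from itertools import product

# Context Z=0: X and Y independent; context Z=1: X copies Y.
p_z = {0: 0.5, 1: 0.5}
joint = {}
for x, y in product([0, 1], repeat=2):
    px = 0.3 if x else 0.7                       # P(X=x | Z=0)
    py = 0.6 if y else 0.4                       # P(Y=y | Z=0)
    joint[(x, y, 0)] = p_z[0] * px * py
    joint[(x, y, 1)] = p_z[1] * (0.5 if x == y else 0.0)

def cond_x(x, y, z):
    """P(X=x | Y=y, Z=z)."""
    num = sum(p for w, p in joint.items() if w == (x, y, z))
    den = sum(p for w, p in joint.items() if w[1] == y and w[2] == z)
    return num / den

def marg_x(x, z):
    """P(X=x | Z=z)."""
    num = sum(p for w, p in joint.items() if w[0] == x and w[2] == z)
    den = sum(p for w, p in joint.items() if w[2] == z)
    return num / den

# Independent in the context Z=0 ...
for x, y in product([0, 1], repeat=2):
    assert abs(cond_x(x, y, 0) - marg_x(x, 0)) < 1e-12
# ... but not in the context Z=1:
print(cond_x(1, 1, 1), marg_x(1, 1))  # 1.0 0.5
```

A single conditional probability table for P(X | Y, Z) would need entries for both contexts, even though Y is irrelevant whenever Z = 0; this is the kind of structure a decision-tree representation can exploit.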