### 6.1.3 Conditional Probability

Typically, we want to know not only the prior probability of some proposition but also how this belief is updated when an agent observes new evidence.

The measure of belief in proposition *h* based on proposition *e* is called
the **conditional probability** of *h* **given** *e*, written *P(h|e)*.

A formula *e* representing the conjunction of *all* of the agent's
**observations** of the world is called
**evidence**. Given evidence *e*, the
conditional probability *P(h|e)* is the agent's **posterior probability**
of *h*. The probability *P(h)* is the **prior probability** of *h*
and is the same as *P(h|true)* because it is the probability before the
agent has observed anything.

The posterior probability involves
conditioning on *everything* the agent knows about a particular
situation. All evidence must be conditioned on to obtain the correct
posterior probability.

**Example 6.3:** For the diagnostic assistant, the patient's symptoms will be the evidence. The prior probability distribution over possible diseases is used before the diagnostic agent finds out about the particular patient. The posterior probability is the probability the agent will use after it has gained some evidence. When the agent acquires new evidence through discussions with the patient, observing symptoms, or the results of lab tests, it must update its posterior probability to reflect the new evidence. The new evidence is the conjunction of the old evidence and the new observations.
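
The updating described in this example can be sketched by enumerating possible worlds. This is only an illustrative sketch: it assumes the standard definition *P(h|e) = P(h∧e)/P(e)* for *P(e) > 0*, and the disease, the symptoms, every number, and the names `World`, `measure`, and `prob` are hypothetical. The toy model also assumes, purely to keep the joint distribution small, that the two symptoms are independent given the disease.

```python
from collections import namedtuple
from itertools import product

# Hypothetical toy model: one disease and two symptoms, all Boolean.
# Every number here is made up for illustration.
World = namedtuple("World", ["flu", "fever", "cough"])

P_FLU = 0.1
P_FEVER = {True: 0.8, False: 0.1}  # P(fever | flu), P(fever | not flu)
P_COUGH = {True: 0.7, False: 0.2}  # P(cough | flu), P(cough | not flu)

def measure(w):
    """Probability of one possible world under the toy model."""
    p = P_FLU if w.flu else 1 - P_FLU
    p *= P_FEVER[w.flu] if w.fever else 1 - P_FEVER[w.flu]
    p *= P_COUGH[w.flu] if w.cough else 1 - P_COUGH[w.flu]
    return p

WORLDS = [World(*values) for values in product([True, False], repeat=3)]

def prob(h, e=lambda w: True):
    """P(h | e) = P(h and e) / P(e); with e omitted this is P(h | true)."""
    p_e = sum(measure(w) for w in WORLDS if e(w))
    p_h_and_e = sum(measure(w) for w in WORLDS if e(w) and h(w))
    return p_h_and_e / p_e

has_flu = lambda w: w.flu

prior = prob(has_flu)                                      # no evidence yet
after_fever = prob(has_flu, lambda w: w.fever)             # evidence: fever
after_both = prob(has_flu, lambda w: w.fever and w.cough)  # evidence: fever and cough

print(f"{prior:.3f} {after_fever:.3f} {after_both:.3f}")   # prints 0.100 0.471 0.757
```

Note that the last posterior conditions on the conjunction of the old evidence and the new observation, not on the new observation alone.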

**Example 6.4:** The information that the delivery robot receives from its sensors is its evidence. When sensors are noisy, the evidence is what is known, such as the particular pattern received by the sensor, not that there is a person in front of the robot. The robot could be mistaken about what is in the world but it knows what information it received.

**Other Possible Measures of Belief**

Justifying other measures of belief is problematic. Consider, for example,
the proposal that the belief in *α∧β* is some function of the
belief in *α* and the belief in *β*. Such a measure of belief is
called **compositional**. To see why this is not sensible, consider the single toss of
a fair coin. Compare the case where *α₁* is "the coin will land
heads" and *β₁* is "the coin will land tails" with the case where
*α₂* is "the coin will land heads" and *β₂* is "the coin will land
heads." For these two cases, the belief in *α₁* would seem to be the
same as the belief in *α₂*, and the belief in *β₁* would be the same
as the belief in *β₂*. But the belief in *α₁∧β₁*, which is
impossible, is very different from the belief in *α₂∧β₂*, which is
the same as *α₂*.

The conditional probability *P(f|e)* is very different
from the probability of the implication *P(e → f)*. The
latter is the same as *P(¬e ∨ f)*, which is the measure of the
interpretations for which *f* is true or *e* is false. For example,
suppose you have a domain where birds are relatively rare, and
non-flying birds are a small proportion of the birds. Here *P(¬flies | bird)* would be the proportion of birds that do not fly, which
would be low. *P(bird → ¬flies)* is the same as *P(¬bird ∨ ¬flies)*, which would be dominated by non-birds and so
would be high. Similarly, *P(bird → flies)* would also be
high, the probability also being dominated by the non-birds. It is difficult to
imagine a situation where the probability of an implication is
the kind of knowledge that is appropriate or useful.
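
To make the contrast concrete, here is a minimal sketch over a made-up population. The counts and the names `COUNTS` and `prob` are hypothetical; they were chosen only so that birds are rare and non-flying birds are a small proportion of the birds. Conditional probability is computed with the standard ratio *P(h∧e)/P(e)*.

```python
# Hypothetical counts of individuals, keyed by (bird, flies).
COUNTS = {
    (True, True): 45,     # flying birds
    (True, False): 5,     # non-flying birds
    (False, True): 50,    # flying non-birds
    (False, False): 900,  # everything else
}

def prob(h, e=lambda bird, flies: True):
    """P(h | e) as a ratio of counts; with e omitted this is P(h)."""
    n_e = sum(n for w, n in COUNTS.items() if e(*w))
    n_h_and_e = sum(n for w, n in COUNTS.items() if e(*w) and h(*w))
    return n_h_and_e / n_e

# Conditional probability: restrict attention to the birds.
p_not_flies_given_bird = prob(lambda bird, flies: not flies,
                              lambda bird, flies: bird)

# Probability of an implication: e -> f is the same as (not e) or f.
p_bird_implies_not_flies = prob(lambda bird, flies: (not bird) or (not flies))
p_bird_implies_flies = prob(lambda bird, flies: (not bird) or flies)

print(p_not_flies_given_bird)    # 5 / 50
print(p_bird_implies_not_flies)  # 955 / 1000
print(p_bird_implies_flies)      # 995 / 1000
```

With these counts the conditional probability is 0.1, while both implications have probability above 0.95, because the 950 non-birds satisfy each implication regardless of whether they fly.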