Full text of the second edition of Artificial Intelligence: foundations of computational agents, Cambridge University Press, 2017 is now available.

#### 6.4.2.3 From Samples to Probabilities

Probabilities can be estimated from a set of examples using the sample
average. The **sample average** of a proposition *α* is the number
of samples where *α* is true divided by the total number of samples.
The sample average approaches the true probability as the number of
samples approaches infinity by the law of large numbers.

**Hoeffding's inequality** provides an
estimate of the error of an unconditional probability given *n* samples:

**Proposition.**

*[Hoeffding] Suppose*

*p*is the true probability, and*s*is the sample average from*n*independent samples; thenP(|s-p|>ε) ≤ 2e^{-2nε2}.

This theorem can be used to determine how many samples are required to
guarantee a **probably approximately correct** estimate of the
probability. To
guarantee that the error is less than some *ε<0.5*,
infinitely many samples are required. However, if you are willing to have an error
greater than *ε*
in *δ* of the cases, you can solve *2e ^{-2nε2}<δ*
for

*n*, which gives

n>-ln(δ/2)/(2ε^{2}).

For example, suppose you want an error less than
*0.1*, nineteen times out of twenty; that is,
you are only willing to tolerate an error bigger than 0.1, in 5% of
the cases. You can use Hoeffding's bound by setting *ε* to
*0.1* and *δ=0.05*, which gives *n> 184*. Thus, you can guarantee
such bounds on the error with
185 samples. If you want an error of less than *0.01*
in at least *95%* of the cases, 18,445
samples can be used. If you want an error of less than *0.1* in *99%* of the
cases, *265* samples can be used.