foundations of computational agents
${A}^{\mathrm{*}}$ search uses both path cost, as in lowest-cost-first, and heuristic information, as in greedy best-first search, in its selection of which path to expand. For each path on the frontier, ${A}^{*}$ uses an estimate of the total path cost from the start node to a goal node constrained to follow that path initially. It uses $\text{cost}(p)$, the cost of the path found, as well as the heuristic function $h(p)$, the estimated path cost from the end of $p$ to the goal.
For any path $p$ on the frontier, define $f(p)=\text{cost}(p)+h(p)$. This is an estimate of the total path cost to follow path $p$ then go to a goal node. If $n$ is the node at the end of path $p$, this can be depicted as:
$$\underbrace{\underbrace{\text{start}\stackrel{\text{actual}}{\longrightarrow}n}_{\text{cost}(p)}\;\underbrace{\stackrel{\text{estimate}}{\longrightarrow}\text{goal}}_{h(p)}}_{f(p)}$$
If $h(n)$ is an admissible heuristic and so never overestimates the cost from node $n$ to a goal node, then $f(p)$ does not overestimate the path cost of going from the start node to a goal node via $p$.
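The step behind this claim can be written out in one line. Here $c^{*}(n)$ is a symbol introduced only for this sketch (it does not appear in the text above); it denotes the actual cost of a lowest-cost path from $n$ to a goal node. Admissibility says $h(n)\le c^{*}(n)$, so
$$f(p)=\text{cost}(p)+h(n)\;\le\;\text{cost}(p)+c^{*}(n),$$
and the right-hand side is the cost of the cheapest solution that begins with $p$.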
${A}^{*}$ is implemented using the generic search algorithm, treating the frontier as a priority queue ordered by $f(p)$.
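The following is a minimal Python sketch of this idea, not a definitive implementation. The function name `a_star`, its parameters, and the toy graph and heuristic values are assumptions made for illustration; the graph is not the one from Example 3.5. The frontier is a binary heap ordered by $f(p)$, and no pruning of multiple paths to the same node is done.

```python
import heapq
from itertools import count

def a_star(neighbors, h, start, is_goal):
    """Return (cost, path) for the first goal path selected, or None.

    neighbors(n) yields (next_node, arc_cost) pairs; h(n) estimates the
    cost from n to a goal. Whole paths are stored on the frontier.
    """
    tie = count()  # insertion counter, so the heap never compares paths
    frontier = [(h(start), next(tie), 0, [start])]  # (f, tie, cost, path)
    while frontier:
        f, _, cost, path = heapq.heappop(frontier)  # a path with minimal f(p)
        node = path[-1]
        if is_goal(node):
            return cost, path
        for nxt, arc_cost in neighbors(node):
            new_cost = cost + arc_cost
            heapq.heappush(
                frontier,
                (new_cost + h(nxt), next(tie), new_cost, path + [nxt]),
            )
    return None

# Hypothetical toy graph and admissible heuristic values.
graph = {
    "s": [("a", 2), ("b", 5)],
    "a": [("b", 1), ("g", 7)],
    "b": [("g", 3)],
    "g": [],
}
h_values = {"s": 5, "a": 4, "b": 3, "g": 0}

result = a_star(lambda n: graph[n], lambda n: h_values[n], "s", lambda n: n == "g")
print(result)  # (6, ['s', 'a', 'b', 'g'])
```

The insertion counter breaks ties between paths with equal $f$-value in first-in, first-out order; that is one arbitrary choice among several.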
Consider using $A^{*}$ search in Example 3.5 using the heuristic function of Example 3.13. In this example, the paths on the frontier are shown using the final node of the path, subscripted with the $f$-value of the path. The frontier is initially $[o103_{21}]$, because $h(o103)=21$ and the cost of the path is zero. It is replaced by its neighbors, forming the frontier
$$[b3_{21},\mathit{ts}_{31},o109_{36}].$$
The first element represents the path $\langle o103,b3\rangle$; its $f$-value is $f(\langle o103,b3\rangle)=\text{cost}(\langle o103,b3\rangle)+h(b3)=21$. Next $b3$ is selected and replaced by its neighbors, forming the frontier
$$[b1_{21},b4_{29},\mathit{ts}_{31},o109_{36}].$$
Then the path to $b1$ is selected and replaced by its neighbors, forming the frontier
$$[c2_{21},b2_{29},b4_{29},\mathit{ts}_{31},o109_{36}].$$
Then the path to $c2$ is selected and replaced by its neighbors, forming
$$[c1_{21},b2_{29},b4_{29},c3_{29},\mathit{ts}_{31},o109_{36}].$$
Up to this stage, the search has been continually exploring what seems to be the direct path to the goal. Next the path to $c1$ is selected and is replaced by its neighbor, forming the frontier
$$[b2_{29},b4_{29},c3_{29},\mathit{ts}_{31},c3_{35},o109_{36}].$$
At this stage, there are two paths to the node $c3$ on the frontier. The path to $c3$ that does not go through $c1$ has a lower $f$-value than the one that does. Later, we consider how to prune one of these paths without giving up optimality.
There are three paths with the same $f$-value. The algorithm does not specify which is selected. Suppose it selects the path to the node with the smallest heuristic value (see Exercise 6), which is the path to $c3$. This path is removed from the frontier and, because $c3$ has no neighbors, the resulting frontier is
$$[b2_{29},b4_{29},\mathit{ts}_{31},c3_{35},o109_{36}].$$
Next $b2$ is selected, resulting in the frontier
$$[b4_{29},\mathit{ts}_{31},c3_{35},b4_{35},o109_{36}].$$
The first path to $b4$ is selected next and is replaced by its neighbors, forming
$$[\mathit{ts}_{31},c3_{35},b4_{35},o109_{36},o109_{42}].$$
Note how $A^{*}$ pursues many different paths from the start.
A lowest-cost path to the goal is eventually found. The algorithm is forced to try many different paths, because several of them temporarily seemed to have the lowest cost. It still does better than either lowest-cost-first search or greedy best-first search.
Consider Figure 3.9, which was a problematic graph for the other heuristic methods. Although ${A}^{*}$ initially searches down from $s$ because of the heuristic function, eventually the cost of the path becomes so large that it picks the node on an actual optimal path.
A search algorithm is admissible if, whenever a solution exists, it returns an optimal solution. To guarantee admissibility, some conditions on the graph and the heuristic must hold. The following theorem gives sufficient conditions for ${A}^{*}$ to be admissible.
(${A}^{\mathrm{*}}$ admissibility) If there is a solution, ${A}^{\mathrm{*}}$ using heuristic function $h$ always returns an optimal solution, if
the branching factor is finite (each node has a bounded number of neighbors),
all arc costs are greater than some $\epsilon>0$, and
$h$ is an admissible heuristic, which means that $h(n)$ is less than or equal to the actual cost of the lowest-cost path from node $n$ to a goal node.
Part A: A solution will be found. If the arc costs are all greater than some $\epsilon>0$, we say the costs are bounded above zero. If this holds and the branching factor is finite then, eventually, for all paths $p$ in the frontier, $\text{cost}(p)$ will exceed any finite number and, thus, will exceed a solution cost if one exists (with each path having no more than $c/\epsilon$ arcs, where $c$ is the cost of an optimal solution). Because the branching factor is finite, only a finite number of paths must be expanded before the search could get to this point, but the ${A}^{*}$ search would have found a solution by then. Bounding the arc costs above zero is a sufficient condition for ${A}^{*}$ to avoid suffering from Zeno’s paradox, as described for lowest-cost-first search.
Part B: The first path to a goal selected is an optimal path. Because $h$ is admissible, the $f$-value of a node on an optimal solution path is less than or equal to the cost of an optimal solution, which, by the definition of optimal, is less than the cost of any non-optimal solution. The $f$-value of a solution is equal to the cost of the solution if the heuristic is admissible, because a non-negative admissible $h$ is zero at a goal node. Because an element with minimum $f$-value is chosen at each step, a non-optimal solution can never be chosen while there is a path on the frontier that leads to an optimal solution. So, before it can select a non-optimal solution, ${A}^{*}$ will have to pick all of the nodes on an optimal path, including an optimal solution. ∎
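On a small explicit graph, the third condition of the theorem can be checked directly. The sketch below is an illustration only: the names `true_costs_to_goal` and `is_admissible`, the dictionary-based graph format, and the toy data are all assumptions introduced here. It computes the actual lowest cost from every node to a goal with Dijkstra's algorithm on the reversed arcs, then compares each $h(n)$ against it.

```python
import heapq

def true_costs_to_goal(graph, goals):
    """Lowest cost from each node to some goal, via Dijkstra on reversed arcs."""
    reverse = {n: [] for n in graph}
    for n, arcs in graph.items():
        for m, c in arcs:
            reverse.setdefault(m, []).append((n, c))
    dist = {g: 0 for g in goals}
    heap = [(0, g) for g in goals]
    heapq.heapify(heap)
    while heap:
        d, n = heapq.heappop(heap)
        if d > dist.get(n, float("inf")):
            continue  # stale heap entry
        for m, c in reverse.get(n, []):
            if d + c < dist.get(m, float("inf")):
                dist[m] = d + c
                heapq.heappush(heap, (d + c, m))
    return dist

def is_admissible(graph, goals, h):
    """True if h[n] never exceeds the actual lowest cost from n to a goal."""
    best = true_costs_to_goal(graph, goals)
    return all(h[n] <= best.get(n, float("inf")) for n in graph)

# Hypothetical toy graph: node -> list of (neighbor, arc cost).
toy_graph = {
    "s": [("a", 2), ("b", 5)],
    "a": [("b", 1), ("g", 7)],
    "b": [("g", 3)],
    "g": [],
}
h_ok = {"s": 5, "a": 4, "b": 3, "g": 0}
h_bad = {"s": 5, "a": 5, "b": 3, "g": 0}  # overestimates at "a" (true cost is 4)

print(is_admissible(toy_graph, {"g"}, h_ok))   # True
print(is_admissible(toy_graph, {"g"}, h_bad))  # False
```

Such an exhaustive check is only practical when the whole graph fits in memory; for large or implicit graphs, admissibility is normally argued from how the heuristic is constructed.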
It should be noted that the admissibility of ${A}^{*}$ does not ensure that every intermediate node selected from the frontier is on an optimal path from the start node to a goal node. Admissibility ensures that the first solution found will be optimal even in graphs with cycles. It does not ensure that the algorithm will not change its mind about which partial path is the best while it is searching.
To see how the heuristic function improves the efficiency of ${A}^{*}$, suppose $c$ is the cost of a least-cost path from the start node to a goal node. ${A}^{*}$, with an admissible heuristic, expands all paths from the start node in the set (whose initial parts are also in the set):
$$\{p:\text{cost}(p)+h(p)<c\}$$
and some of the paths in the set
$$\{p:\text{cost}(p)+h(p)=c\}.$$
Increasing $h$ while keeping it admissible affects the efficiency of ${A}^{*}$ if it reduces the size of the first of these sets. If the second set is large, there can be great variability in the space and time used by ${A}^{*}$. The space and time can be sensitive to the tie-breaking mechanism for selecting a path from those with the same $f$-value. It could, for example, select a path with minimal $h$-value or use a first-in last-out protocol (i.e., the same as a depth-first search) for these paths; see Exercise 6.
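One way to realize these tie-breaking rules, sketched here under assumptions, is to order the priority queue by a tuple whose first component is $f$ and whose second component encodes the tie-break. The function `selection_order`, the labels, and the cost and $h$ numbers below are hypothetical and chosen only so that all three paths share the same $f$-value.

```python
import heapq

# Hypothetical frontier entries (label, cost, h); every path has f = 29.
paths = [("p1", 17, 12), ("p2", 14, 15), ("p3", 11, 18)]

def selection_order(paths, tie_key):
    """Order in which a heap keyed by (f, tie_key) would select the paths."""
    heap = []
    for arrival, (label, cost, h) in enumerate(paths):
        heapq.heappush(heap, (cost + h, tie_key(arrival, h), label))
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]

# Prefer the path whose end node has the smallest heuristic value.
min_h_first = selection_order(paths, lambda arrival, h: h)
# Prefer the most recently added path (first-in last-out, depth-first-like).
last_in_first = selection_order(paths, lambda arrival, h: -arrival)

print(min_h_first)    # ['p1', 'p2', 'p3']
print(last_in_first)  # ['p3', 'p2', 'p1']
```

Both rules select only among paths of minimal $f$-value, so neither affects the optimality guarantee; they change only which of the tied paths are expanded first.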
Iterative deepening can also be applied to an ${A}^{*}$ search. Iterative Deepening ${A}^{\mathrm{*}}$ (${\text{IDA}}^{*}$) performs repeated depth-bounded depth-first searches. Instead of the bound being on the number of arcs in the path, it is a bound on the value of $f(n)$. The threshold is initially the value of $f(s)$, where $s$ is the start node. ${\text{IDA}}^{*}$ then carries out a depth-first depth-bounded search but never expands a path with a higher $f$-value than the current bound. If the depth-bounded search fails and the bound was reached, the next bound is the minimum of the $f$-values that exceeded the previous bound. ${\text{IDA}}^{*}$ thus checks the same nodes as ${A}^{*}$, perhaps breaking ties differently, but recomputes them with a depth-first search instead of storing them.
If all that is required is an approximately optimal path, for example within $\delta $ of optimal, the bound can be $\delta $ plus the minimum of the $f$-values that exceeded the previous bound. This can make the search much more efficient in cases where the path lengths can be very close to each other.
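The two paragraphs above can be sketched together in Python. This is an illustrative sketch under assumptions, not a reference implementation: the name `ida_star`, its parameters, and the toy graph are introduced here. With `delta=0` it uses the plain bound update; a positive `delta` adds the slack described for approximately optimal paths. The sketch assumes a solution exists or the graph is finite and acyclic; otherwise the outer loop need not terminate.

```python
def ida_star(neighbors, h, start, is_goal, delta=0):
    """Iterative deepening on f-values. Returns (cost, path) or None."""
    path = [start]

    def search(cost, bound):
        """Depth-first search that never expands a path with f above bound.

        Returns (solution cost or None, smallest f-value seen above bound).
        """
        node = path[-1]
        f = cost + h(node)
        if f > bound:
            return None, f
        if is_goal(node):
            return cost, f
        smallest_exceeding = float("inf")
        for nxt, arc_cost in neighbors(node):
            path.append(nxt)
            found, exceeded = search(cost + arc_cost, bound)
            if found is not None:
                return found, exceeded  # leave path intact: it is the solution
            path.pop()
            smallest_exceeding = min(smallest_exceeding, exceeded)
        return None, smallest_exceeding

    bound = h(start)  # initial bound is f(s) for the start node s
    while True:
        found, exceeded = search(0, bound)
        if found is not None:
            return found, list(path)
        if exceeded == float("inf"):
            return None  # nothing was cut off by the bound, so no solution
        bound = exceeded + delta

# Hypothetical toy graph and admissible heuristic values.
ida_graph = {
    "s": [("a", 2), ("b", 5)],
    "a": [("b", 1), ("g", 7)],
    "b": [("g", 3)],
    "g": [],
}
ida_h = {"s": 5, "a": 4, "b": 3, "g": 0}

exact = ida_star(lambda n: ida_graph[n], lambda n: ida_h[n], "s", lambda n: n == "g")
print(exact)  # (6, ['s', 'a', 'b', 'g'])
```

Only the current path is stored, so the space used is linear in the path length, at the price of re-expanding the shallow part of the search on every iteration.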