6 Planning with Certainty

6.3 Regression Planning

Dimensions: flat, features, indefinite horizon, fully observable, deterministic, goal directed, non-learning, single agent, offline, perfect rationality

It is often more efficient to search in a different search space – one where the nodes are not states but rather are subgoals to be achieved. A subgoal is an assignment to some of the features.

Regression planning is searching in the graph defined by the following:

  • The nodes are subgoals.

  • The arcs correspond to actions. In particular, an arc from node g to g, labeled with action act, means

    • act is the last action that is carried out before subgoal g is achieved, and

    • node g is a subgoal that must be true immediately before act so that g is true immediately after act.

  • The start node is the planning goal to be achieved.

  • The goal condition for the search, goal(g), is true if g is true of the initial state.

Given a node that represents subgoal g, a neighbor of g exists for every action act such that

  • act is possible, it is possible for act to be carried out and for g to be true immediately after act and

  • act is useful, act achieves part of g.

Consider the subgoal g={X1=v1,,Xn=vn}. In terms of the STRIPS representation, act is useful for solving g if, for some i, Xi=vi is an effect of action act. Immediately before act, the preconditions of act, as well as any Xk=vk not achieved by act, must hold. Thus the neighbor of subgoal g on the arc labeled with act is the subgoal precondition(act)(geffects(act)), as long as act is possible. Action act is possible if:

  • for each Xj=vj in g, there is no effect Xj=vj of act where vjvj and

  • precondition(act)(geffects(act)) does not include two different assignments to any feature.

Example 6.11.

Suppose the goal is to achieve ¬swc. Therefore, the start node is {¬swc}. If this is true in the initial state, the planner stops. If not, it chooses an action that achieves ¬swc. In this case, there is only one such action: dc. The precondition of dc is {off,rhc}. Thus, there is one arc:

{¬swc},{off,rhc} labeled with dc.

Consider the node {off,rhc}. There are two actions that can achieve off, namely mc from cs and mcc from lab. There is one action that can achieve rhc, namely puc. However, puc has as precondition {cs,¬rhc}, but cs and off are inconsistent (because they involve different assignments to the variable RLoc). Therefore, puc is not a possible last action; it is not possible that, immediately after puc, the condition {off,rhc} holds.

Figure 6.3: Part of the search space for a regression planner

Figure 6.3 shows the first three levels of the search space (without cycle pruning or multiple-path pruning). Note that the search space is the same no matter what the initial state is. The starting state has two roles: to serve as a stopping criterion and as a source of heuristics.

If a subgoal contains a number of assignments, regression can often determine which of the assignments to achieve last, as in the following example.

Example 6.12.

Suppose the goal is for Sam to not want coffee and for the robot to have coffee: {¬swc,rhc}. The last action cannot be dc to achieve ¬swc, because this achieves ¬rhc. The only last action must be puc to achieve rhc. Thus, the resulting subgoal is {¬swc,cs}. Again, the last action before this subgoal cannot be to achieve ¬swc because this has as a precondition off, which is inconsistent with cs. Therefore, the second-to-last action must be a move action to achieve cs.

In terms of the feature-based representation of actions, an action act is useful if there is a causal rule that achieves Xi=vi for some i, using action act. The neighbor of this node along the arc labeled with action act is the proposition


where body(Xi=vi,act) is the set of assignments of variables in the body of a rule that specifies when Xi=vi is true immediately after act. There is no such neighbor if there is no corresponding rule for some i, or if the proposition is inconsistent (i.e., assigns different values to a variable). Note that, if multiple rules are applicable for the same action, there will be multiple neighbors.

Search algorithms such as A* and branch-and-bound can make use of heuristic knowledge. For a regression planner, the heuristic value of a node is an estimate of the cost to achieve the subgoal represented by the node from the initial state. This form of heuristic as an estimate of the cost of achieving a subgoal from a state is the same as used in a forward planner. So, for example, the heuristic of Example 6.10 could also be used for a regression planner. However, an effective heuristic for a regression planner may not be very useful for a forward planner, and vice versa (see Exercise 4).

One problem that arises in regression planning is that a subgoal may not be achievable. Deciding whether a subgoal is achievable is often difficult to infer from the definitions of the actions. For example, consider the restriction that an object cannot be at two different places at the same time; sometimes this is not explicitly represented and is only implicit in the effects of an action, and the fact that the object is only in one position initially. It is possible to have domain knowledge to prune nodes that can be shown to be inconsistent.

Cycle pruning and multiple-path pruning may be incorporated into a regression planner. The regression planner does not have to visit exactly the same node to prune the search. If the subgoal represented by a node n is a superset of a subgoal on the path to n, node n can be pruned. Similarly, for multiple-path pruning, see Exercise 10.

A regression planner commits to a total ordering of actions, even if no particular reason exists for one ordering over another. This commitment to a total ordering tends to increase the complexity of the search space if the actions do not interact much or at all. For example, a regression planner would test each permutation of a sequence of actions when it may be possible to show that no ordering can succeed.