foundations of computational agents
Dimensions: flat, features, indefinite horizon, fully observable, deterministic, goal directed, non-learning, single agent, offline, perfect rationality
The forward and regression planners enforce a total ordering on actions at all stages of the planning process. The CSP planner commits to the particular time that the action will be carried out. This means that those planners have to commit to an ordering of actions when adding them to a partial plan, even if there is no particular reason to put one action before another.
A partial-order planner maintains a partial ordering between actions and only commits to an ordering between actions when forced. This is sometimes also called a non-linear planner, which is a misnomer because such planners often produce a linear plan.
Because the same action may be used a number of times in the same plan, for example, the robot may need to move clockwise a number of times, the partial ordering will be between action instances, where an action instance is just a pair of an action and an integer, which we will write as . By the preconditions and effects of the action instance, we mean the precondition and the effects of the action.
A partial ordering is a binary relation that is transitive and asymmetric. A partial-order plan is a set of action instances together with a partial ordering between them, representing a “before” relation on action instances. Write if action instance is before action instance in the partial order. This means that the action of must occur before the action of . The aim of the planner is to produce a partial ordering of the action instances so that any total ordering consistent with the partial ordering will solve the goal from the initial state.
There are two special action instances, , that achieves the relations that are true in the initial state, and , whose precondition is the goal to be solved. Every other action instance is after and before in the partial ordering. The use of these as action instances means that the algorithm does not require special cases for the initial situation and for the goals. When the preconditions of are achieved, the goal is solved.
Any action instance, other than or , will be in a partial-order plan to achieve a precondition of an action instance in the plan. Each precondition of an action instance in the plan is either true in the initial state, and so achieved by , or there will be an action instance in the plan that achieves . The action instance that achieves must be before ; that is, . To be correct, the algorithm must also ensure that nothing makes false in between and .
A causal link is a triple , where and are action instances and is a assignment that is in the precondition of , and in the effect of . This means that makes hold for . With this causal link, any other action instance that makes false must either be before or after .
Informally, a partial-order planner works as follows. Begin with the action instances and and the partial order . The planner maintains an agenda that is a set of pairs, where is an action instance in the plan and is a variable-value assignment that is a precondition of that remains to be achieved. Initially, the agenda contains pairs , where is an assignment that must be true in the goal state.
At each stage in the planning process, a pair is chosen from the agenda, where is in the precondition for action instance . Then an action instance, , is chosen to achieve . That action instance is either already in the plan – it could be the action, for example – or it is a new action instance that is added to the plan. Action instance must happen before in the partial order. The planner adds a causal link that records that achieves for action . Any action in the plan that makes false must happen either before or after . If is a new action, its preconditions are added to the agenda, and the process continues until the agenda is empty.
The algorithm is given in Figure 6.6. This is a non-deterministic procedure. The “choose” and the “either …or …” form choices that must be searched over. There are two choices that require search:
which action is chosen to achieve and
whether an action instance that deletes happens before or after .
The function returns the constraints formed by adding the constraint to , and it fails if is incompatible with . There are many ways this function can be implemented. See Exercise 12.
The function checks whether , and the effect of is inconsistent with . If so, either is added to the set of constraints or is added to the set of constraints. This is a non-deterministic choice that is searched over.
Consider the goal of Sam not wanting coffee and no mail waiting (i.e., ), where in the initial state Rob is in the lab, Sam wants coffee, Rob does not have coffee, there is mail waiting and Rob does not have mail, i.e., , , , , .
We will write instances of action as where is a unique integer.
Initially the agenda is
Suppose is chosen and removed from the agenda. One action can achieve , namely deliver coffee, , with preconditions and . So it inserts an instance, say , into the plan. After the first time through the repeat loop, contains
At this stage, the value of is . There is one causal link, . This causal link means that no action that undoes is allowed to happen between and .
Suppose is chosen from the agenda. One action can achieve this, , with precondition . The algorithm constructs a new action instance, say . The causal link is added to the set of causal links; and are added to the agenda.
Suppose is chosen from the agenda. The action achieves , because is true initially. The causal link is added to the set of causal links. Nothing is added to the agenda.
At this stage, there is no ordering imposed between and .
Suppose is removed from the agenda. There are two actions that can achieve : with preconditions , and with preconditions . The algorithm searches over these choices. Suppose it chooses the action instance . The causal link is added.
The first violation of a causal link occurs when a move action is used to achieve . This action violates the causal link , and so must happen after (the robot goes to the mail room after delivering coffee) or before .
Eventually, it finds a plan of action instances, such as:
This is the only total ordering consistent with the partial ordering.
A partial order planner works particularly well when no ordering of actions can achieve a goal, as it does not need to search over all permutations of the actions. It also works well when many orderings can solve the goal, in which case it can find a flexible plan for the robot.