Artificial Intelligence - foundations of computational agents -- 8.5 Partial-Order Planning

Third edition of Artificial Intelligence: foundations of computational agents, Cambridge University Press, 2023 is now available (including the full text).

8.5 Partial-Order Planning

The forward and regression planners enforce a total ordering on actions at all stages of the planning process. The CSP planner commits to the particular time that the action will be carried out. This means that those planners have to commit to an ordering of actions that cannot occur concurrently when adding them to a partial plan, even if there is no particular reason to put one action before another.

The idea of a partial-order planner is to have a partial ordering between actions and only commit to an ordering between actions when forced. This is sometimes also called a non-linear planner, which is a misnomer because such planners often produce a linear plan.

A partial ordering is a less-than relation that is transitive and asymmetric. A partial-order plan is a set of actions together with a partial ordering, representing a "before" relation on actions, such that any total ordering of the actions, consistent with the partial ordering, will solve the goal from the initial state. Write act₀ < act₁ if action act₀ is before action act₁ in the partial order. This means that action act₀ must occur before action act₁.

For uniformity, treat start as an action that achieves the relations that are true in the initial state, and treat finish as an action whose precondition is the goal to be solved. The pseudoaction start is before every other action, and finish is after every other action. The use of these as actions means that the algorithm does not require special cases for the initial situation and for the goals. When the preconditions of finish hold, the goal is solved.

An action, other than start or finish, will be in a partial-order plan to achieve a precondition of an action in the plan. Each precondition of an action in the plan is either true in the initial state, and so achieved by start, or there will be an action in the plan that achieves it.

We must ensure that the actions achieve the conditions they were assigned to achieve. Each precondition P of an action act₁ in a plan will have an action act₀ associated with it such that act₀ achieves precondition P for act₁. The triple ⟨act₀,P,act₁⟩ is a causal link. The partial order specifies that action act₀ occurs before action act₁, which is written as act₀ < act₁. Any other action A that makes P false must either be before act₀ or after act₁.

Informally, a partial-order planner works as follows: Begin with the actions start and finish and the partial order start < finish. The planner maintains an agenda that is a set of ⟨P,A⟩ pairs, where A is an action in the plan and P is an atom that is a precondition of A that must be achieved. Initially the agenda contains pairs ⟨G,finish⟩, where G is an atom that must be true in the goal state.

At each stage in the planning process, a pair ⟨G,act₁⟩ is selected from the agenda, where P is a precondition for action act₁. Then an action, act₀, is chosen to achieve P. That action is either already in the plan - it could be the start action, for example - or it is a new action that is added to the plan. Action act₀ must happen before act₁ in the partial order. It adds a causal link that records that act₀ achieves P for action act₁. Any action in the plan that deletes P must happen either before act₀ or after act₁. If act₀ is a new action, its preconditions are added to the agenda, and the process continues until the agenda is empty.

This is a non-deterministic procedure. The "choose" and the "either ...or ..." form choices that must be searched over. There are two choices that require search:

which action is selected to achieve G and
whether an action that deletes G happens before act₀ or after act₁.

non-deterministic procedure PartialOrderPlanner(Gs)
2:           Inputs
3:                     Gs: set of atomic propositions to achieve
4:           Output
5:                     linear plan to achieve Gs
6:           Local
7:                     Agenda: set of ⟨P,A⟩ pairs where P is atom and A an action
8:                     Actions: set of actions in the current plan
9:                     Constraints: set of temporal constraints on actions
10:                     CausalLinks: set of ⟨act₀,P,act₁⟩ triples
11:           Agenda ←{⟨G,finish⟩:G ∈Gs}
12:           Actions ←{start,finish}
13:           Constraints ←{start<finish}
14:           CausalLinks ←{}
15:           repeat
16:                     select and remove ⟨G,act₁⟩ from Agenda
17:                     either
18:                               choose act₀ ∈Actions such that act₀ achieves G
19:                     or
20:                               choose act₀ ∉Actions such that act₀ achieves G
21:                               Actions ←Actions ∪{act₀}
22:                               Constraints ←add_const(start<act₀,Constraints)
23:                               for each CL∈CausalLinks do
24:                                         Constraints ←protect(CL,act₀,Constraints)
25:
26:                               Agenda ←Agenda ∪{⟨P,act₀⟩: P is a precondition of act₀ }
27:
28 :                    Constraints ←add_const(act₀<act₁,Constraints)
29:                     CausalLinks ∪ {⟨act_o,G,act₁⟩}
30:                     for each A∈Actions do
31:                               Constraints ←protect(⟨act_o,G,act₁⟩,A,Constraints)
32:
33:           until Agenda={}
34:           return total ordering of Actions consistent with Constraints

Figure 8.5: Partial-order planner

The algorithm PartialOrderPlanner is given in Figure 8.5.

The function add_const(act₀<act₁,Constraints) returns the constraints formed by adding the constraint act₀<act₁ to Constraints, and it fails if act₀<act₁ is incompatible with Constraints. There are many ways this function can be implemented. See Exercise 8.13.

The function protect(⟨act_o,G,act₁⟩,A,Constraints) checks whether A≠act₀ and A≠act₁ and A deletes G. If so, it returns either { A<act₀ } ∪ Constraints or { act₁<A } ∪ Constraints. This is a non-deterministic choice that is searched over. Otherwise it returns Constraints.

Example 8.14: Consider the goal ¬swc ∧ ¬mw, where the initial state contains RLoc=lab, swc, ¬rhc, mw, ¬rhm.

Initially the agenda is

⟨ ¬swc,finish⟩,⟨ ¬mw,finish⟩.

Suppose ⟨ ¬swc,finish⟩ is selected and removed from the agenda. One action exists that can achieve ¬swc, namely deliver coffee, dc, with preconditions off and rhc. At the end of the repeat loop, Agenda contains

⟨off,dc⟩,⟨rhc,dc⟩,⟨ ¬mw,finish⟩.

Constraints is {start<finish, start < dc, dc <finish}. There is one causal link, ⟨dc, ¬swc,finish⟩. This causal link means that no action that undoes ¬swc is allowed to happen after dc and before finish.

Suppose ⟨ ¬mw,finish⟩ is selected from the agenda. One action exists that can achieve this, pum, with preconditions mw and RLoc=mr. The causal link ⟨pum, ¬mw,finish⟩ is added to the set of causal links; ⟨mw,pum⟩ and ⟨mr,pum⟩ are added to the agenda.

Suppose ⟨mw,pum⟩ is selected from the agenda. The action start achieves mw, because mw is true initially. The causal link ⟨start,mw,pum⟩ is added to the set of causal links. Nothing is added to the agenda.

At this stage, there is no ordering imposed between dc and pum.

Suppose ⟨off,dc⟩ is removed from the agenda. There are two actions that can achieve off: mc_cs with preconditions cs, and mcc_lab with preconditions lab. The algorithm searches over these choices. Suppose it chooses mc_cs. Then the causal link ⟨mc_cs,off,dc⟩ is added.

The first violation of a causal link occurs when a move action is used to achieve ⟨mr,pum⟩. This action violates the causal link ⟨mc_cs,off,dc⟩, and so must happen after dc (the robot goes to the mail room after delivering coffee) or before mc_cs.

The preceding algorithm has glossed over one important detail. It is sometimes necessary to perform some action more than once in a plan. The preceding algorithm will not work in this case, because it will try to find a partial ordering with both instances of the action occurring at the same time. To fix this problem, the ordering should be between action instances, and not actions themselves. To implement this, assign an index to each instance of an action in the plan, and the ordering is on the action instance indexes and not the actions themselves. This is left as an exercise.