A partial-order planning algorithm is significantly different from a total-order planner.
The way a partial-order plan works enables it to take advantage of problem decomposition and work on each subproblem separately.
It works on several subgoals independently, solves them with several subplans, and then combines the plan.
A partial-order planner also follows the least commitment strategy, where it delays making choices for as long as possible.
Variables are not bound unless it is absolutely necessary and new actions are chosen only if the existing actions cannot fulfil the required precondition.
Any planning algorithm that can place two actions into a plan without specifying which comes first is called a partial-order planner.
A partial-order planner searches through the space of plans rather than the space of states, which makes it perform better for certain problems.
Let's have a look at the PartialOrderPlanner
class.
In [3]:
from planning import *
from notebook import psource
In [4]:
psource(PartialOrderPlanner)
We will first describe the data-structures and helper methods used, followed by the algorithm used to find a partial-order plan.
Each plan has the following four components:
actions
: a set of actions that make up the steps of the plan.
actions
is always a subset of pddl.actions
the set of possible actions for the given planning problem.
The start
and finish
actions are dummy actions defined to bring uniformity to the problem. The start
action has no preconditions and its effects constitute the initial state of the planning problem.
The finish
action has no effects and its preconditions constitute the goal state of the planning problem.
The empty plan consists of just these two dummy actions.constraints
: a set of temporal constraints that define the order of performing the actions relative to each other.
constraints
does not define a linear ordering, rather it usually represents a directed graph which is also acyclic if the plan is consistent.
Each ordering is of the form A < B, which reads as "A before B" and means that action A must be executed sometime before action B, but not necessarily immediately before.
constraints
stores these as a set of tuples (Action(A), Action(B))
which is interpreted as given above.
A constraint cannot be added to constraints
if it breaks the acyclicity of the existing graph.causal_links
: a set of causal-links.
A causal link between two actions A and B in the plan is written as A --p--> B and is read as "A achieves p for B".
This imples that p is an effect of A and a precondition of B.
It also asserts that p must remain true from the time of action A to the time of action B.
Any violation of this rule is called a threat and must be resolved immediately by adding suitable ordering constraints.
causal_links
stores this information as tuples (Action(A), precondition(p), Action(B))
which is interpreted as given above.
Causal-links can also be called protection-intervals, because the link A --p--> B protects p from being negated over the interval from A to B.agenda
: a set of open-preconditions.
A precondition is open if it is not achieved by some action in the plan.
Planners will work to reduce the set of open preconditions to the empty set, without introducing a contradiction.
agenda
stored this information as tuples (precondition(p), Action(A))
where p is a precondition of the action A.A consistent plan is a plan in which there are no cycles in the ordering constraints and no conflicts with the causal-links.
A consistent plan with no open preconditions is a solution.
Let's briefly glance over the helper functions before going into the actual algorithm.
expand_actions
: generates all possible actions with variable bindings for use as a heuristic of selection of an open precondition.
find_open_precondition
: finds a precondition from the agenda with the least number of actions that fulfil that precondition.
This heuristic helps form mandatory ordering constraints and causal-links to further simplify the problem and reduce the probability of encountering a threat.
find_action_for_precondition
: finds an action that fulfils the given precondition along with the absolutely necessary variable bindings in accordance with the principle of least commitment.
In case of multiple possible actions, the action with the least number of effects is chosen to minimize the chances of encountering a threat.
cyclic
: checks if a directed graph is cyclic.
add_const
: adds constraint
to constraints
if the newly formed graph is acyclic and returns constraints
otherwise.
is_a_threat
: checks if the given effect
negates the given precondition
.
protect
: checks if the given action
poses a threat to the given causal_link
.
If so, the threat is resolved by either promotion or demotion, whichever generates acyclic temporal constraints.
If neither promotion or demotion work, the chosen action is not the correct fit or the planning problem cannot be solved altogether.
convert
: converts a graph from a list of edges to an Action
: set
mapping, for use in topological sorting.
toposort
: a generator function that generates a topological ordering of a given graph as a list of sets.
Each set contains an action or several actions.
If a set has more that one action in it, it means that permutations between those actions also produce a valid plan.
display_plan
: displays the causal_links
, constraints
and the partial order plan generated from toposort
.
The execute
method executes the algorithm, which is summarized below:
A partial-order plan can be used to generate different valid total-order plans.
This step is called linearization of the partial-order plan.
All possible linearizations of a partial-order plan for socks_and_shoes
looks like this.
Linearization can be carried out in many ways, but the most efficient way is to represent the set of temporal constraints as a directed graph.
We can easily realize that the graph should also be acyclic as cycles in constraints means that the constraints are inconsistent.
This acyclicity is enforced by the add_const
method, which adds a new constraint only if the acyclicity of the existing graph is not violated.
The protect
method also checks for acyclicity of the newly-added temporal constraints to make a decision between promotion and demotion in case of a threat.
This property of a graph created from the temporal constraints of a valid partial-order plan allows us to use topological sort to order the constraints linearly.
A topological sort may produce several different valid solutions for a given directed acyclic graph.
Now that we know how PartialOrderPlanner
works, let's solve a few problems using it.
In [5]:
st = spare_tire()
pop = PartialOrderPlanner(st)
pop.execute()
We observe that in the given partial order plan, Remove(Flat, Axle) and Remove(Spare, Trunk) are in the same set. This means that the order of performing these actions does not affect the final outcome. That aside, we also see that the PutOn(Spare, Axle) action has to be performed after both the Remove actions are complete, which seems logically consistent.
In [6]:
sbw = simple_blocks_world()
pop = PartialOrderPlanner(sbw)
pop.execute()
We see that this plan does not have flexibility in selecting actions, ie, actions should be performed in this order and this order only, to successfully reach the goal state.
In [7]:
ss = socks_and_shoes()
pop = PartialOrderPlanner(ss)
pop.execute()
This plan again doesn't have constraints in selecting socks or shoes.
As long as both socks are worn before both shoes, we are fine.
Notice however, there is one valid solution,
LeftSock -> LeftShoe -> RightSock -> RightShoe
that the algorithm could not find as it cannot be represented as a general partially-ordered plan but is a specific total-order solution.
In [8]:
ss = socks_and_shoes()
In [9]:
%%timeit
GraphPlan(ss).execute()
In [10]:
%%timeit
Linearize(ss).execute()
In [11]:
%%timeit
PartialOrderPlanner(ss).execute(display=False)
We observe that GraphPlan
is about 4 times faster than Linearize
because Linearize
essentially runs a GraphPlan
subroutine under the hood and then carries out some transformations on the solved planning-graph.
We also find that GraphPlan
is slightly faster than PartialOrderPlanner
, but this is mainly due to the expand_actions
method in PartialOrderPlanner
that slows it down as it generates all possible permutations of actions and variable bindings.
Without heuristic functions, PartialOrderPlanner
will be atleast as fast as GraphPlan
, if not faster, but will have a higher tendency to encounter threats and conflicts which might take additional time to resolve.
Different planning algorithms work differently for different problems.