PARTIAL ORDER PLANNER

A partial-order planning algorithm is significantly different from a total-order planner. The way a partial-order plan works enables it to take advantage of problem decomposition and work on each subproblem separately. It works on several subgoals independently, solves them with several subplans, and then combines the plan.
A partial-order planner also follows the least commitment strategy, where it delays making choices for as long as possible. Variables are not bound unless it is absolutely necessary and new actions are chosen only if the existing actions cannot fulfil the required precondition.
Any planning algorithm that can place two actions into a plan without specifying which comes first is called a partial-order planner. A partial-order planner searches through the space of plans rather than the space of states, which makes it perform better for certain problems.

Let's have a look at the PartialOrderPlanner class.


In [3]:
from planning import *
from notebook import psource

In [4]:
psource(PartialOrderPlanner)


class PartialOrderPlanner:

    def __init__(self, planningproblem):
        self.planningproblem = planningproblem
        self.initialize()

    def initialize(self):
        """Initialize all variables"""
        self.causal_links = []
        self.start = Action('Start', [], self.planningproblem.init)
        self.finish = Action('Finish', self.planningproblem.goals, [])
        self.actions = set()
        self.actions.add(self.start)
        self.actions.add(self.finish)
        self.constraints = set()
        self.constraints.add((self.start, self.finish))
        self.agenda = set()
        for precond in self.finish.precond:
            self.agenda.add((precond, self.finish))
        self.expanded_actions = self.expand_actions()

    def expand_actions(self, name=None):
        """Generate all possible actions with variable bindings for precondition selection heuristic"""

        objects = set(arg for clause in self.planningproblem.init for arg in clause.args)
        expansions = []
        action_list = []
        if name is not None:
            for action in self.planningproblem.actions:
                if str(action.name) == name:
                    action_list.append(action)
        else:
            action_list = self.planningproblem.actions

        for action in action_list:
            for permutation in itertools.permutations(objects, len(action.args)):
                bindings = unify(Expr(action.name, *action.args), Expr(action.name, *permutation))
                if bindings is not None:
                    new_args = []
                    for arg in action.args:
                        if arg in bindings:
                            new_args.append(bindings[arg])
                        else:
                            new_args.append(arg)
                    new_expr = Expr(str(action.name), *new_args)
                    new_preconds = []
                    for precond in action.precond:
                        new_precond_args = []
                        for arg in precond.args:
                            if arg in bindings:
                                new_precond_args.append(bindings[arg])
                            else:
                                new_precond_args.append(arg)
                        new_precond = Expr(str(precond.op), *new_precond_args)
                        new_preconds.append(new_precond)
                    new_effects = []
                    for effect in action.effect:
                        new_effect_args = []
                        for arg in effect.args:
                            if arg in bindings:
                                new_effect_args.append(bindings[arg])
                            else:
                                new_effect_args.append(arg)
                        new_effect = Expr(str(effect.op), *new_effect_args)
                        new_effects.append(new_effect)
                    expansions.append(Action(new_expr, new_preconds, new_effects))

        return expansions

    def find_open_precondition(self):
        """Find open precondition with the least number of possible actions"""

        number_of_ways = dict()
        actions_for_precondition = dict()
        for element in self.agenda:
            open_precondition = element[0]
            possible_actions = list(self.actions) + self.expanded_actions
            for action in possible_actions:
                for effect in action.effect:
                    if effect == open_precondition:
                        if open_precondition in number_of_ways:
                            number_of_ways[open_precondition] += 1
                            actions_for_precondition[open_precondition].append(action)
                        else:
                            number_of_ways[open_precondition] = 1
                            actions_for_precondition[open_precondition] = [action]

        number = sorted(number_of_ways, key=number_of_ways.__getitem__)
        
        for k, v in number_of_ways.items():
            if v == 0:
                return None, None, None

        act1 = None
        for element in self.agenda:
            if element[0] == number[0]:
                act1 = element[1]
                break

        if number[0] in self.expanded_actions:
            self.expanded_actions.remove(number[0])

        return number[0], act1, actions_for_precondition[number[0]]

    def find_action_for_precondition(self, oprec):
        """Find action for a given precondition"""

        # either
        #   choose act0 E Actions such that act0 achieves G
        for action in self.actions:
            for effect in action.effect:
                if effect == oprec:
                    return action, 0

        # or
        #   choose act0 E Actions such that act0 achieves G
        for action in self.planningproblem.actions:
            for effect in action.effect:
                if effect.op == oprec.op:
                    bindings = unify(effect, oprec)
                    if bindings is None:
                        break
                    return action, bindings

    def generate_expr(self, clause, bindings):
        """Generate atomic expression from generic expression given variable bindings"""

        new_args = []
        for arg in clause.args:
            if arg in bindings:
                new_args.append(bindings[arg])
            else:
                new_args.append(arg)

        try:
            return Expr(str(clause.name), *new_args)
        except:
            return Expr(str(clause.op), *new_args)
        
    def generate_action_object(self, action, bindings):
        """Generate action object given a generic action andvariable bindings"""

        # if bindings is 0, it means the action already exists in self.actions
        if bindings == 0:
            return action

        # bindings cannot be None
        else:
            new_expr = self.generate_expr(action, bindings)
            new_preconds = []
            for precond in action.precond:
                new_precond = self.generate_expr(precond, bindings)
                new_preconds.append(new_precond)
            new_effects = []
            for effect in action.effect:
                new_effect = self.generate_expr(effect, bindings)
                new_effects.append(new_effect)
            return Action(new_expr, new_preconds, new_effects)

    def cyclic(self, graph):
        """Check cyclicity of a directed graph"""

        new_graph = dict()
        for element in graph:
            if element[0] in new_graph:
                new_graph[element[0]].append(element[1])
            else:
                new_graph[element[0]] = [element[1]]

        path = set()

        def visit(vertex):
            path.add(vertex)
            for neighbor in new_graph.get(vertex, ()):
                if neighbor in path or visit(neighbor):
                    return True
            path.remove(vertex)
            return False

        value = any(visit(v) for v in new_graph)
        return value

    def add_const(self, constraint, constraints):
        """Add the constraint to constraints if the resulting graph is acyclic"""

        if constraint[0] == self.finish or constraint[1] == self.start:
            return constraints

        new_constraints = set(constraints)
        new_constraints.add(constraint)

        if self.cyclic(new_constraints):
            return constraints
        return new_constraints

    def is_a_threat(self, precondition, effect):
        """Check if effect is a threat to precondition"""

        if (str(effect.op) == 'Not' + str(precondition.op)) or ('Not' + str(effect.op) == str(precondition.op)):
            if effect.args == precondition.args:
                return True
        return False

    def protect(self, causal_link, action, constraints):
        """Check and resolve threats by promotion or demotion"""

        threat = False
        for effect in action.effect:
            if self.is_a_threat(causal_link[1], effect):
                threat = True
                break

        if action != causal_link[0] and action != causal_link[2] and threat:
            # try promotion
            new_constraints = set(constraints)
            new_constraints.add((action, causal_link[0]))
            if not self.cyclic(new_constraints):
                constraints = self.add_const((action, causal_link[0]), constraints)
            else:
                # try demotion
                new_constraints = set(constraints)
                new_constraints.add((causal_link[2], action))
                if not self.cyclic(new_constraints):
                    constraints = self.add_const((causal_link[2], action), constraints)
                else:
                    # both promotion and demotion fail
                    print('Unable to resolve a threat caused by', action, 'onto', causal_link)
                    return
        return constraints

    def convert(self, constraints):
        """Convert constraints into a dict of Action to set orderings"""

        graph = dict()
        for constraint in constraints:
            if constraint[0] in graph:
                graph[constraint[0]].add(constraint[1])
            else:
                graph[constraint[0]] = set()
                graph[constraint[0]].add(constraint[1])
        return graph

    def toposort(self, graph):
        """Generate topological ordering of constraints"""

        if len(graph) == 0:
            return

        graph = graph.copy()

        for k, v in graph.items():
            v.discard(k)

        extra_elements_in_dependencies = _reduce(set.union, graph.values()) - set(graph.keys())

        graph.update({element:set() for element in extra_elements_in_dependencies})
        while True:
            ordered = set(element for element, dependency in graph.items() if len(dependency) == 0)
            if not ordered:
                break
            yield ordered
            graph = {element: (dependency - ordered) for element, dependency in graph.items() if element not in ordered}
        if len(graph) != 0:
            raise ValueError('The graph is not acyclic and cannot be linearly ordered')

    def display_plan(self):
        """Display causal links, constraints and the plan"""

        print('Causal Links')
        for causal_link in self.causal_links:
            print(causal_link)

        print('\nConstraints')
        for constraint in self.constraints:
            print(constraint[0], '<', constraint[1])

        print('\nPartial Order Plan')
        print(list(reversed(list(self.toposort(self.convert(self.constraints))))))

    def execute(self, display=True):
        """Execute the algorithm"""

        step = 1
        self.tries = 1
        while len(self.agenda) > 0:
            step += 1
            # select <G, act1> from Agenda
            try:
                G, act1, possible_actions = self.find_open_precondition()
            except IndexError:
                print('Probably Wrong')
                break

            act0 = possible_actions[0]
            # remove <G, act1> from Agenda
            self.agenda.remove((G, act1))

            # For actions with variable number of arguments, use least commitment principle
            # act0_temp, bindings = self.find_action_for_precondition(G)
            # act0 = self.generate_action_object(act0_temp, bindings)

            # Actions = Actions U {act0}
            self.actions.add(act0)

            # Constraints = add_const(start < act0, Constraints)
            self.constraints = self.add_const((self.start, act0), self.constraints)

            # for each CL E CausalLinks do
            #   Constraints = protect(CL, act0, Constraints)
            for causal_link in self.causal_links:
                self.constraints = self.protect(causal_link, act0, self.constraints)

            # Agenda = Agenda U {<P, act0>: P is a precondition of act0}
            for precondition in act0.precond:
                self.agenda.add((precondition, act0))

            # Constraints = add_const(act0 < act1, Constraints)
            self.constraints = self.add_const((act0, act1), self.constraints)

            # CausalLinks U {<act0, G, act1>}
            if (act0, G, act1) not in self.causal_links:
                self.causal_links.append((act0, G, act1))

            # for each A E Actions do
            #   Constraints = protect(<act0, G, act1>, A, Constraints)
            for action in self.actions:
                self.constraints = self.protect((act0, G, act1), action, self.constraints)

            if step > 200:
                print('Couldn\'t find a solution')
                return None, None

        if display:
            self.display_plan()
        else:
            return self.constraints, self.causal_links                

We will first describe the data-structures and helper methods used, followed by the algorithm used to find a partial-order plan.

Each plan has the following four components:

  1. actions: a set of actions that make up the steps of the plan. actions is always a subset of pddl.actions the set of possible actions for the given planning problem. The start and finish actions are dummy actions defined to bring uniformity to the problem. The start action has no preconditions and its effects constitute the initial state of the planning problem. The finish action has no effects and its preconditions constitute the goal state of the planning problem. The empty plan consists of just these two dummy actions.
  2. constraints: a set of temporal constraints that define the order of performing the actions relative to each other. constraints does not define a linear ordering, rather it usually represents a directed graph which is also acyclic if the plan is consistent. Each ordering is of the form A < B, which reads as "A before B" and means that action A must be executed sometime before action B, but not necessarily immediately before. constraints stores these as a set of tuples (Action(A), Action(B)) which is interpreted as given above. A constraint cannot be added to constraints if it breaks the acyclicity of the existing graph.
  3. causal_links: a set of causal-links. A causal link between two actions A and B in the plan is written as A --p--> B and is read as "A achieves p for B". This imples that p is an effect of A and a precondition of B. It also asserts that p must remain true from the time of action A to the time of action B. Any violation of this rule is called a threat and must be resolved immediately by adding suitable ordering constraints. causal_links stores this information as tuples (Action(A), precondition(p), Action(B)) which is interpreted as given above. Causal-links can also be called protection-intervals, because the link A --p--> B protects p from being negated over the interval from A to B.
  4. agenda: a set of open-preconditions. A precondition is open if it is not achieved by some action in the plan. Planners will work to reduce the set of open preconditions to the empty set, without introducing a contradiction. agenda stored this information as tuples (precondition(p), Action(A)) where p is a precondition of the action A.

A consistent plan is a plan in which there are no cycles in the ordering constraints and no conflicts with the causal-links. A consistent plan with no open preconditions is a solution.

Let's briefly glance over the helper functions before going into the actual algorithm.
expand_actions: generates all possible actions with variable bindings for use as a heuristic of selection of an open precondition.
find_open_precondition: finds a precondition from the agenda with the least number of actions that fulfil that precondition. This heuristic helps form mandatory ordering constraints and causal-links to further simplify the problem and reduce the probability of encountering a threat.
find_action_for_precondition: finds an action that fulfils the given precondition along with the absolutely necessary variable bindings in accordance with the principle of least commitment. In case of multiple possible actions, the action with the least number of effects is chosen to minimize the chances of encountering a threat.
cyclic: checks if a directed graph is cyclic.
add_const: adds constraint to constraints if the newly formed graph is acyclic and returns constraints otherwise.
is_a_threat: checks if the given effect negates the given precondition.
protect: checks if the given action poses a threat to the given causal_link. If so, the threat is resolved by either promotion or demotion, whichever generates acyclic temporal constraints. If neither promotion or demotion work, the chosen action is not the correct fit or the planning problem cannot be solved altogether.
convert: converts a graph from a list of edges to an Action : set mapping, for use in topological sorting.
toposort: a generator function that generates a topological ordering of a given graph as a list of sets. Each set contains an action or several actions. If a set has more that one action in it, it means that permutations between those actions also produce a valid plan.
display_plan: displays the causal_links, constraints and the partial order plan generated from toposort.

The execute method executes the algorithm, which is summarized below:

  1. An open precondition is selected (a sub-goal that we want to achieve).
  2. An action that fulfils the open precondition is chosen.
  3. Temporal constraints are updated.
  4. Existing causal links are protected. Protection is a method that checks if the causal links conflict and if they do, temporal constraints are added to fix the threats.
  5. The set of open preconditions is updated.
  6. Temporal constraints of the selected action and the next action are established.
  7. A new causal link is added between the selected action and the owner of the open precondition.
  8. The set of new causal links is checked for threats and if found, the threat is removed by either promotion or demotion. If promotion or demotion is unable to solve the problem, the planning problem cannot be solved with the current sequence of actions or it may not be solvable at all.
  9. These steps are repeated until the set of open preconditions is empty.

A partial-order plan can be used to generate different valid total-order plans. This step is called linearization of the partial-order plan. All possible linearizations of a partial-order plan for socks_and_shoes looks like this.

Linearization can be carried out in many ways, but the most efficient way is to represent the set of temporal constraints as a directed graph. We can easily realize that the graph should also be acyclic as cycles in constraints means that the constraints are inconsistent. This acyclicity is enforced by the add_const method, which adds a new constraint only if the acyclicity of the existing graph is not violated. The protect method also checks for acyclicity of the newly-added temporal constraints to make a decision between promotion and demotion in case of a threat. This property of a graph created from the temporal constraints of a valid partial-order plan allows us to use topological sort to order the constraints linearly. A topological sort may produce several different valid solutions for a given directed acyclic graph.

Now that we know how PartialOrderPlanner works, let's solve a few problems using it.


In [5]:
st = spare_tire()
pop = PartialOrderPlanner(st)
pop.execute()


Causal Links
(Action(PutOn(Spare, Axle)), At(Spare, Axle), Action(Finish))
(Action(Start), Tire(Spare), Action(PutOn(Spare, Axle)))
(Action(Remove(Flat, Axle)), NotAt(Flat, Axle), Action(PutOn(Spare, Axle)))
(Action(Start), At(Flat, Axle), Action(Remove(Flat, Axle)))
(Action(Remove(Spare, Trunk)), At(Spare, Ground), Action(PutOn(Spare, Axle)))
(Action(Start), At(Spare, Trunk), Action(Remove(Spare, Trunk)))
(Action(Remove(Flat, Axle)), At(Flat, Ground), Action(Finish))

Constraints
Action(Remove(Flat, Axle)) < Action(PutOn(Spare, Axle))
Action(Start) < Action(Finish)
Action(Remove(Spare, Trunk)) < Action(PutOn(Spare, Axle))
Action(Start) < Action(Remove(Spare, Trunk))
Action(Start) < Action(Remove(Flat, Axle))
Action(Remove(Flat, Axle)) < Action(Finish)
Action(PutOn(Spare, Axle)) < Action(Finish)
Action(Start) < Action(PutOn(Spare, Axle))

Partial Order Plan
[{Action(Start)}, {Action(Remove(Flat, Axle)), Action(Remove(Spare, Trunk))}, {Action(PutOn(Spare, Axle))}, {Action(Finish)}]

We observe that in the given partial order plan, Remove(Flat, Axle) and Remove(Spare, Trunk) are in the same set. This means that the order of performing these actions does not affect the final outcome. That aside, we also see that the PutOn(Spare, Axle) action has to be performed after both the Remove actions are complete, which seems logically consistent.


In [6]:
sbw = simple_blocks_world()
pop = PartialOrderPlanner(sbw)
pop.execute()


Causal Links
(Action(FromTable(C, B)), On(C, B), Action(Finish))
(Action(FromTable(B, A)), On(B, A), Action(Finish))
(Action(Start), OnTable(B), Action(FromTable(B, A)))
(Action(Start), OnTable(C), Action(FromTable(C, B)))
(Action(Start), Clear(C), Action(FromTable(C, B)))
(Action(Start), Clear(A), Action(FromTable(B, A)))
(Action(ToTable(A, B)), Clear(B), Action(FromTable(C, B)))
(Action(Start), On(A, B), Action(ToTable(A, B)))
(Action(ToTable(A, B)), Clear(B), Action(FromTable(B, A)))
(Action(Start), Clear(A), Action(ToTable(A, B)))

Constraints
Action(Start) < Action(FromTable(C, B))
Action(FromTable(B, A)) < Action(FromTable(C, B))
Action(Start) < Action(FromTable(B, A))
Action(Start) < Action(ToTable(A, B))
Action(Start) < Action(Finish)
Action(FromTable(B, A)) < Action(Finish)
Action(FromTable(C, B)) < Action(Finish)
Action(ToTable(A, B)) < Action(FromTable(B, A))
Action(ToTable(A, B)) < Action(FromTable(C, B))

Partial Order Plan
[{Action(Start)}, {Action(ToTable(A, B))}, {Action(FromTable(B, A))}, {Action(FromTable(C, B))}, {Action(Finish)}]

We see that this plan does not have flexibility in selecting actions, ie, actions should be performed in this order and this order only, to successfully reach the goal state.


In [7]:
ss = socks_and_shoes()
pop = PartialOrderPlanner(ss)
pop.execute()


Causal Links
(Action(RightShoe), RightShoeOn, Action(Finish))
(Action(LeftShoe), LeftShoeOn, Action(Finish))
(Action(LeftSock), LeftSockOn, Action(LeftShoe))
(Action(RightSock), RightSockOn, Action(RightShoe))

Constraints
Action(LeftSock) < Action(LeftShoe)
Action(RightSock) < Action(RightShoe)
Action(Start) < Action(RightShoe)
Action(Start) < Action(Finish)
Action(LeftShoe) < Action(Finish)
Action(Start) < Action(RightSock)
Action(Start) < Action(LeftShoe)
Action(Start) < Action(LeftSock)
Action(RightShoe) < Action(Finish)

Partial Order Plan
[{Action(Start)}, {Action(LeftSock), Action(RightSock)}, {Action(LeftShoe), Action(RightShoe)}, {Action(Finish)}]

This plan again doesn't have constraints in selecting socks or shoes. As long as both socks are worn before both shoes, we are fine. Notice however, there is one valid solution,
LeftSock -> LeftShoe -> RightSock -> RightShoe
that the algorithm could not find as it cannot be represented as a general partially-ordered plan but is a specific total-order solution.

Runtime differences

Let's briefly take a look at the running time of all the three algorithms on the socks_and_shoes problem.


In [8]:
ss = socks_and_shoes()

In [9]:
%%timeit
GraphPlan(ss).execute()


198 µs ± 3.53 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [10]:
%%timeit
Linearize(ss).execute()


844 µs ± 23.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [11]:
%%timeit
PartialOrderPlanner(ss).execute(display=False)


258 µs ± 4.03 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

We observe that GraphPlan is about 4 times faster than Linearize because Linearize essentially runs a GraphPlan subroutine under the hood and then carries out some transformations on the solved planning-graph.
We also find that GraphPlan is slightly faster than PartialOrderPlanner, but this is mainly due to the expand_actions method in PartialOrderPlanner that slows it down as it generates all possible permutations of actions and variable bindings.
Without heuristic functions, PartialOrderPlanner will be atleast as fast as GraphPlan, if not faster, but will have a higher tendency to encounter threats and conflicts which might take additional time to resolve.
Different planning algorithms work differently for different problems.