SOLVING PLANNING PROBLEMS


GRAPHPLAN


The GraphPlan algorithm is a popular method of solving classical planning problems. Before we get into the details of the algorithm, let's look at a special data structure called a planning graph, which is used to give better heuristic estimates and plays a key role in the GraphPlan algorithm.

Planning Graph

A planning graph is a directed graph organized into levels. Each level contains information about the current state of the knowledge base and the possible state-action links to and from that level. The first level contains the initial state, with one node for each fluent that holds in it. This level has state-action links connecting each state to the actions that are valid in that state. Each action is linked to all its preconditions and its effect states. Based on these effects, the next level is constructed, and it contains similarly structured information about the next state. In this way, the graph is expanded using state-action links until we reach a level where all the required goals hold simultaneously. We can say that we have reached our goal if all goal states appear in the current level and no two of them are mutually exclusive. Mutual exclusivity is explained in detail below.
Planning graphs only work for propositional planning problems, so we need to eliminate all variables by generating all possible substitutions.
In the planning graph of the have_cake_and_eat_cake_too problem, for example, the black lines indicate links between states and actions.
In every planning problem, we are allowed to carry out the no-op action, i.e., we can choose no action for a particular state. These are called 'Persistence' actions and are represented in the graph by the small square boxes. In technical terms, a persistence action has the same effects as its preconditions. This enables us to carry a state to the next level.
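The role of persistence actions can be illustrated with a small, self-contained sketch. It does not use the module's classes: the Action record and the helpers persistence_action and next_level_literals are hypothetical names introduced only for this illustration, and literals are plain strings with negation written as a 'Not' prefix.

```python
from collections import namedtuple

# Hypothetical minimal action record: a name plus sets of literal strings.
Action = namedtuple('Action', ['name', 'precond', 'effect'])


def persistence_action(literal):
    """Build a no-op whose only precondition and only effect is `literal`."""
    return Action('P' + literal, frozenset([literal]), frozenset([literal]))


def next_level_literals(state, actions):
    """Collect the effects of every action whose preconditions hold in `state`."""
    result = set()
    for action in actions:
        if action.precond <= state:  # all preconditions are present
            result |= action.effect
    return result


state = {'Have(Cake)'}
eat = Action('Eat(Cake)',
             frozenset(['Have(Cake)']),
             frozenset(['NotHave(Cake)', 'Eaten(Cake)']))
actions = [eat] + [persistence_action(lit) for lit in state]

print(sorted(next_level_literals(state, actions)))
# ['Eaten(Cake)', 'Have(Cake)', 'NotHave(Cake)']
```

Without the persistence action, Have(Cake) would vanish from the next level even though choosing not to eat the cake keeps it true.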

The gray lines indicate mutual exclusivity. This means that the actions connected by a gray line cannot be taken together. Mutual exclusivity (mutex) occurs in the following cases:

  1. Inconsistent effects: One action negates the effect of the other. For example, Eat(Cake) and the persistence of Have(Cake) have inconsistent effects because they disagree on the effect Have(Cake).
  2. Interference: One of the effects of an action is the negation of a precondition of the other. For example, Eat(Cake) interferes with the persistence of Have(Cake) by negating its precondition.
  3. Competing needs: One of the preconditions of one action is mutually exclusive with a precondition of the other. For example, Bake(Cake) and Eat(Cake) are mutex because they compete on the value of the Have(Cake) precondition.
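The three conditions above can be written as simple predicates over pairs of actions. The following is a minimal sketch under stated assumptions, not the module's implementation: actions are hypothetical (name, precond, effect) records over string literals, a negative literal is marked by a 'Not' prefix, and competing needs is checked only for directly complementary preconditions.

```python
from collections import namedtuple

# Hypothetical action record used only for this illustration.
Action = namedtuple('Action', ['name', 'precond', 'effect'])


def negate(literal):
    """Toggle the 'Not' prefix: Have(Cake) <-> NotHave(Cake)."""
    return literal[3:] if literal.startswith('Not') else 'Not' + literal


def inconsistent_effects(a, b):
    """One action's effect is the negation of an effect of the other."""
    return any(negate(e) in b.effect for e in a.effect)


def interference(a, b):
    """An effect of one action negates a precondition of the other."""
    return (any(negate(e) in b.precond for e in a.effect) or
            any(negate(e) in a.precond for e in b.effect))


def competing_needs(a, b):
    """A precondition of one action is the negation of one of the other's."""
    return any(negate(p) in b.precond for p in a.precond)


eat = Action('Eat(Cake)', {'Have(Cake)'}, {'NotHave(Cake)', 'Eaten(Cake)'})
bake = Action('Bake(Cake)', {'NotHave(Cake)'}, {'Have(Cake)'})
keep = Action('PHave(Cake)', {'Have(Cake)'}, {'Have(Cake)'})

print(inconsistent_effects(eat, keep))  # True
print(interference(eat, keep))          # True
print(competing_needs(bake, eat))       # True
print(competing_needs(eat, keep))       # False
```

Eat(Cake) and the persistence of Have(Cake) share the same precondition, so they do not compete on needs, but they remain mutex through the first two conditions.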

In the module, planning graphs have been implemented using two classes: Level, which stores data for a particular level, and Graph, which connects multiple levels together. Let's look at the Level class.


In [14]:
from planning import *
from notebook import psource

In [15]:
psource(Level)


class Level:
    """
    Contains the state of the planning problem
    and exhaustive list of actions which use the
    states as pre-condition.
    """

    def __init__(self, kb):
        """Initializes variables to hold state and action details of a level"""

        self.kb = kb
        # current state
        self.current_state = kb.clauses
        # current action to state link
        self.current_action_links = {}
        # current state to action link
        self.current_state_links = {}
        # current action to next state link
        self.next_action_links = {}
        # next state to current action link
        self.next_state_links = {}
        # mutually exclusive actions
        self.mutex = []

    def __call__(self, actions, objects):
        self.build(actions, objects)
        self.find_mutex()

    def separate(self, e):
        """Separates an iterable of elements into positive and negative parts"""

        positive = []
        negative = []
        for clause in e:
            if clause.op[:3] == 'Not':
                negative.append(clause)
            else:
                positive.append(clause)
        return positive, negative

    def find_mutex(self):
        """Finds mutually exclusive actions"""

        # Inconsistent effects
        pos_nsl, neg_nsl = self.separate(self.next_state_links)

        for negeff in neg_nsl:
            new_negeff = Expr(negeff.op[3:], *negeff.args)
            for poseff in pos_nsl:
                if new_negeff == poseff:
                    for a in self.next_state_links[poseff]:
                        for b in self.next_state_links[negeff]:
                            if {a, b} not in self.mutex:
                                self.mutex.append({a, b})

        # Interference will be calculated with the last step
        pos_csl, neg_csl = self.separate(self.current_state_links)

        # Competing needs
        for posprecond in pos_csl:
            for negprecond in neg_csl:
                new_negprecond = Expr(negprecond.op[3:], *negprecond.args)
                if new_negprecond == posprecond:
                    for a in self.current_state_links[posprecond]:
                        for b in self.current_state_links[negprecond]:
                            if {a, b} not in self.mutex:
                                self.mutex.append({a, b})

        # Inconsistent support
        state_mutex = []
        for pair in self.mutex:
            next_state_0 = self.next_action_links[list(pair)[0]]
            if len(pair) == 2:
                next_state_1 = self.next_action_links[list(pair)[1]]
            else:
                next_state_1 = self.next_action_links[list(pair)[0]]
            if (len(next_state_0) == 1) and (len(next_state_1) == 1):
                state_mutex.append({next_state_0[0], next_state_1[0]})
        
        self.mutex = self.mutex + state_mutex

    def build(self, actions, objects):
        """Populates the lists and dictionaries containing the state action dependencies"""

        for clause in self.current_state:
            p_expr = Expr('P' + clause.op, *clause.args)
            self.current_action_links[p_expr] = [clause]
            self.next_action_links[p_expr] = [clause]
            self.current_state_links[clause] = [p_expr]
            self.next_state_links[clause] = [p_expr]

        for a in actions:
            num_args = len(a.args)
            possible_args = tuple(itertools.permutations(objects, num_args))

            for arg in possible_args:
                if a.check_precond(self.kb, arg):
                    for num, symbol in enumerate(a.args):
                        if not symbol.op.islower():
                            arg = list(arg)
                            arg[num] = symbol
                            arg = tuple(arg)

                    new_action = a.substitute(Expr(a.name, *a.args), arg)
                    self.current_action_links[new_action] = []

                    for clause in a.precond:
                        new_clause = a.substitute(clause, arg)
                        self.current_action_links[new_action].append(new_clause)
                        if new_clause in self.current_state_links:
                            self.current_state_links[new_clause].append(new_action)
                        else:
                            self.current_state_links[new_clause] = [new_action]
                   
                    self.next_action_links[new_action] = []
                    for clause in a.effect:
                        new_clause = a.substitute(clause, arg)

                        self.next_action_links[new_action].append(new_clause)
                        if new_clause in self.next_state_links:
                            self.next_state_links[new_clause].append(new_action)
                        else:
                            self.next_state_links[new_clause] = [new_action]

    def perform_actions(self):
        """Performs the necessary actions and returns a new Level"""

        new_kb = FolKB(list(set(self.next_state_links.keys())))
        return Level(new_kb)

Each level stores the following data

  1. The current state of the level in current_state
  2. Links from an action to its preconditions in current_action_links
  3. Links from a state to the possible actions in that state in current_state_links
  4. Links from each action to its effects in next_action_links
  5. Links from each possible next state to the actions that produce it in next_state_links. This stores the same information as the current_action_links of the next level.
  6. Mutex links in mutex.

The find_mutex method finds the mutex links for inconsistent effects and competing needs as described in the points above, and then marks states as mutex through inconsistent support.
The build method populates the data structures storing the state and action information. Persistence actions for each clause in the current state are also defined here. The newly created persistence action has the same name as its state, prefixed with a 'P'.
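The build method grounds each action schema by trying every ordered selection of objects for its arguments with itertools.permutations. The sketch below isolates that enumeration step. The function ground_action and its string output format are assumptions made for illustration, and the precondition check that build applies to each candidate is left out.

```python
import itertools


def ground_action(name, parameters, objects):
    """Yield every ground instance of an action schema as a string.

    One instance is produced per ordered selection of len(parameters)
    distinct objects, mirroring the use of itertools.permutations.
    """
    for combo in itertools.permutations(objects, len(parameters)):
        yield '{}({})'.format(name, ', '.join(combo))


objects = ['C1', 'P1', 'SFO']
instances = list(ground_action('Load', ['c', 'p', 'a'], objects))

print(len(instances))  # 6
print(instances[0])    # Load(C1, P1, SFO)
```

Because permutations never repeats an element within one selection, an instance such as Load(C1, C1, SFO) is never generated. The number of candidates still grows quickly with the number of objects, which is why filtering by preconditions matters.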

Let's now look at the Graph class.


In [16]:
psource(Graph)


class Graph:
    """
    Contains levels of state and actions
    Used in graph planning algorithm to extract a solution
    """

    def __init__(self, planningproblem):
        self.planningproblem = planningproblem
        self.kb = FolKB(planningproblem.init)
        self.levels = [Level(self.kb)]
        self.objects = set(arg for clause in self.kb.clauses for arg in clause.args)

    def __call__(self):
        self.expand_graph()

    def expand_graph(self):
        """Expands the graph by a level"""

        last_level = self.levels[-1]
        last_level(self.planningproblem.actions, self.objects)
        self.levels.append(last_level.perform_actions())

    def non_mutex_goals(self, goals, index):
        """Checks whether the goals are mutually exclusive"""

        goal_perm = itertools.combinations(goals, 2)
        for g in goal_perm:
            if set(g) in self.levels[index].mutex:
                return False
        return True

The class stores a problem definition in planningproblem, a knowledge base in kb, a list of Level objects in levels and all the possible arguments found in the initial state of the problem in objects.
The expand_graph method generates a new level of the graph. This method is invoked when the goal conditions haven't been met in the current level or the actions that lead to it are mutually exclusive. The non_mutex_goals method checks whether the goals in the current state are mutually exclusive.
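The pairwise test behind non_mutex_goals can be tried in isolation. This is a minimal sketch that assumes mutex pairs are kept as a list of two-element sets, as in the Level class shown earlier. The function name goals_are_non_mutex and the sample data are hypothetical.

```python
import itertools


def goals_are_non_mutex(goals, mutex_pairs):
    """Return True if no two goals form a recorded mutex pair."""
    return all(set(pair) not in mutex_pairs
               for pair in itertools.combinations(goals, 2))


goals = ['Have(Cake)', 'Eaten(Cake)']

# At a level where the two goals are recorded as mutex, the test fails.
print(goals_are_non_mutex(goals, [{'Have(Cake)', 'Eaten(Cake)'}]))  # False

# At a level with no such pair, the goals may hold together.
print(goals_are_non_mutex(goals, []))  # True
```

Using sets for the pairs makes the check independent of the order in which the two goals are listed.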

Using these two classes, we can define a planning graph which can either be used to provide reliable heuristics for planning problems or used in the GraphPlan algorithm.
Let's have a look at the GraphPlan class.


In [17]:
psource(GraphPlan)


class GraphPlan:
    """
    Class for formulation GraphPlan algorithm
    Constructs a graph of state and action space
    Returns solution for the planning problem
    """

    def __init__(self, planningproblem):
        self.graph = Graph(planningproblem)
        self.nogoods = []
        self.solution = []

    def check_leveloff(self):
        """Checks if the graph has levelled off"""

        check = (set(self.graph.levels[-1].current_state) == set(self.graph.levels[-2].current_state))

        if check:
            return True

    def extract_solution(self, goals, index):
        """Extracts the solution"""

        level = self.graph.levels[index]    
        if not self.graph.non_mutex_goals(goals, index):
            self.nogoods.append((level, goals))
            return

        level = self.graph.levels[index - 1]    

        # Create all combinations of actions that satisfy the goal    
        actions = []
        for goal in goals:
            actions.append(level.next_state_links[goal])    

        all_actions = list(itertools.product(*actions))    

        # Filter out non-mutex actions
        non_mutex_actions = []    
        for action_tuple in all_actions:
            action_pairs = itertools.combinations(list(set(action_tuple)), 2)        
            non_mutex_actions.append(list(set(action_tuple)))        
            for pair in action_pairs:            
                if set(pair) in level.mutex:
                    non_mutex_actions.pop(-1)
                    break
    

        # Recursion
        for action_list in non_mutex_actions:        
            if [action_list, index] not in self.solution:
                self.solution.append([action_list, index])

                new_goals = []
                for act in set(action_list):                
                    if act in level.current_action_links:
                        new_goals = new_goals + level.current_action_links[act]

                if abs(index) + 1 == len(self.graph.levels):
                    return
                elif (level, new_goals) in self.nogoods:
                    return
                else:
                    self.extract_solution(new_goals, index - 1)

        # Level-Order multiple solutions
        solution = []
        for item in self.solution:
            if item[1] == -1:
                solution.append([])
                solution[-1].append(item[0])
            else:
                solution[-1].append(item[0])

        for num, item in enumerate(solution):
            item.reverse()
            solution[num] = item

        return solution

    def goal_test(self, kb):
        return all(kb.ask(q) is not False for q in self.graph.planningproblem.goals)

    def execute(self):
        """Executes the GraphPlan algorithm for the given problem"""

        while True:
            self.graph.expand_graph()
            if (self.goal_test(self.graph.levels[-1].kb) and self.graph.non_mutex_goals(self.graph.planningproblem.goals, -1)):
                solution = self.extract_solution(self.graph.planningproblem.goals, -1)
                if solution:
                    return solution
            
            if len(self.graph.levels) >= 2 and self.check_leveloff():
                return None

Given a planning problem defined as a PlanningProblem, GraphPlan creates a planning graph stored in graph and expands it till it reaches a state where all its required goals are present simultaneously without mutual exclusivity.
Once a goal is found, extract_solution is called. This method recursively finds the path to a solution given a planning graph. In the case where extract_solution fails to find a solution for a set of goals at a given level, we record the (level, goals) pair as a no-good. Whenever extract_solution is called again with the same level and goals, we can find the recorded no-good and immediately return failure rather than searching again. No-goods are also used in the termination test.
The check_leveloff method checks if the planning graph for the problem has levelled off, i.e., it has the same states, actions and mutex pairs as the previous level. In the implementation above, the check compares only the states of the last two levels. If the graph has already levelled off and we haven't found a solution, there is no point in expanding the graph further, as it won't lead to anything new. In such a case, we can declare that the planning problem is unsolvable with the given constraints.

To summarize, the GraphPlan algorithm calls expand_graph and tests whether it has reached the goal and if the goals are non-mutex.
If so, extract_solution is invoked which recursively reconstructs the solution from the planning graph.
If not, then we check if our graph has levelled off and continue if it hasn't.
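The control flow summarized above can be sketched independently of the planning module. In this illustration the four steps are passed in as callbacks, and the function graphplan_loop, the toy problem in run_toy and the max_levels safety cap are all assumptions introduced here rather than part of the module.

```python
def graphplan_loop(expand, goals_reachable, extract, levelled_off, max_levels=50):
    """Expand, test the goals, try extraction, then check for level-off."""
    for _ in range(max_levels):  # hypothetical cap so the sketch always ends
        expand()
        if goals_reachable():
            plan = extract()
            if plan:
                return plan
        if levelled_off():
            return None
    return None


def run_toy(goal):
    """Toy problem: each literal enables the next one in a fixed chain."""
    levels = [{'A'}]
    growth = {'A': 'B', 'B': 'C'}

    def expand():
        last = levels[-1]
        levels.append(last | {growth[lit] for lit in last if lit in growth})

    return graphplan_loop(
        expand=expand,
        goals_reachable=lambda: goal <= levels[-1],
        extract=lambda: ['goal reached at level {}'.format(len(levels) - 1)],
        levelled_off=lambda: levels[-1] == levels[-2],
    )


print(run_toy({'C'}))  # ['goal reached at level 2']
print(run_toy({'D'}))  # None
```

The second call returns None because the toy graph stops changing after the third expansion without ever containing D, which is the level-off case.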

Let's solve a few planning problems that we had defined earlier.

Air cargo problem

In accordance with the summary above, we have defined a helper function to carry out GraphPlan on the air_cargo problem. The function is pretty straightforward. Let's have a look.


In [18]:
psource(air_cargo_graphplan)


def air_cargo_graphplan():
    """Solves the air cargo problem using GraphPlan"""
    return GraphPlan(air_cargo()).execute()

Let's instantiate the problem and find a solution using this helper function.


In [19]:
airCargoG = air_cargo_graphplan()
airCargoG


Out[19]:
[[[Load(C2, P2, JFK),
   PAirport(SFO),
   PAirport(JFK),
   PPlane(P2),
   PPlane(P1),
   Fly(P2, JFK, SFO),
   PCargo(C2),
   Load(C1, P1, SFO),
   Fly(P1, SFO, JFK),
   PCargo(C1)],
  [Unload(C2, P2, SFO), Unload(C1, P1, JFK)]]]

Each element in the solution is a valid action. The solution is separated into lists for each level. The actions prefixed with a 'P' are persistence actions and can be ignored, since they simply carry certain states forward. Another helper function, linearize, presents the solution in a more readable, flat format that resembles the output of a total-order planner, although GraphPlan itself is not a total-order planner.
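Dropping the persistence actions from a level-ordered plan can be sketched as follows. This is not necessarily how the module's linearize works. It assumes actions are plain strings and that a persistence action can be recognised by a 'P' followed directly by an uppercase letter, so that ordinary action names such as PutOn are kept. The names is_persistence and flatten_plan are hypothetical.

```python
def is_persistence(name):
    """Assumed convention: 'P' + state name, e.g. PAirport(SFO)."""
    return len(name) > 1 and name[0] == 'P' and name[1].isupper()


def flatten_plan(levelled_plan):
    """Concatenate the levels in order, skipping persistence actions."""
    return [name
            for level in levelled_plan
            for name in level
            if not is_persistence(name)]


plan = [['Load(C2, P2, JFK)', 'PAirport(SFO)', 'PPlane(P2)',
         'Fly(P2, JFK, SFO)', 'PCargo(C2)'],
        ['Unload(C2, P2, SFO)']]

print(flatten_plan(plan))
# ['Load(C2, P2, JFK)', 'Fly(P2, JFK, SFO)', 'Unload(C2, P2, SFO)']

print(is_persistence('PutOn(Spare, Axle)'))  # False
```

The module's own linearize is applied to the actual air cargo solution in the next cell.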


In [20]:
linearize(airCargoG)


Out[20]:
[Load(C2, P2, JFK),
 Fly(P2, JFK, SFO),
 Load(C1, P1, SFO),
 Fly(P1, SFO, JFK),
 Unload(C2, P2, SFO),
 Unload(C1, P1, JFK)]

Indeed, this is a correct solution.
There are similar helper functions for some other planning problems.
Let's try solving the spare tire problem.


In [21]:
spareTireG = spare_tire_graphplan()
linearize(spareTireG)


Out[21]:
[Remove(Spare, Trunk), Remove(Flat, Axle), PutOn(Spare, Axle)]

Solution for the cake problem


In [22]:
cakeProblemG = have_cake_and_eat_cake_too_graphplan()
linearize(cakeProblemG)


Out[22]:
[Eat(Cake), Bake(Cake)]

Solution for the Sussman anomaly configuration of three blocks


In [23]:
sussmanAnomalyG = three_block_tower_graphplan()
linearize(sussmanAnomalyG)


Out[23]:
[MoveToTable(C, A), Move(B, Table, C), Move(A, Table, B)]

Solution of the socks and shoes problem


In [24]:
socksShoesG = socks_and_shoes_graphplan()
linearize(socksShoesG)


Out[24]:
[RightSock, LeftSock, RightShoe, LeftShoe]