Planning combines the two major areas of AI: search and logic. A planner can be seen either as a program that searches for a solution or as one that constructively proves the existence of a solution.
Currently, the most popular and effective approaches to fully automated planning are:

- searching over a planning graph, using the GraphPlan algorithm;
- forward or backward state-space search with domain-independent heuristics;
- translating the problem into a constraint satisfaction problem;
- translating the problem into a Boolean satisfiability problem.
In [1]:
from planning import *
A planning graph is a directed graph organized into levels, each of which contains information about the current state of the knowledge base and the possible state-action links to and from that level.
The first level contains the initial state, with nodes representing each fluent that holds in that level. This level has state-action links linking each state to valid actions in that state. Each action is linked to all its preconditions and its effect states. Based on these effects, the next level is constructed and contains similarly structured information about the next state. In this way, the graph is expanded using state-action links until we reach a level where all the required goals hold true simultaneously.
In every planning problem, we are also allowed to carry out the no-op action, i.e., we can choose to take no action in a particular state. These are called persistence actions and have the same effects as their preconditions. This enables us to carry a state forward to the next level.
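As a minimal sketch (using a hypothetical `namedtuple`-based action type, not the notebook's own `Action` class), a persistence action for a fluent simply carries it forward unchanged:

```python
from collections import namedtuple

# A minimal STRIPS-style action: name, preconditions and effects as frozensets of fluents.
Action = namedtuple('Action', ['name', 'precond', 'effect'])

def persistence(fluent):
    """Build the no-op (persistence) action for a single fluent: its effect
    is identical to its precondition, so the fluent survives to the next level."""
    return Action('NoOp(%s)' % fluent, frozenset([fluent]), frozenset([fluent]))

noop = persistence('At(Robot, Room1)')
print(noop.precond == noop.effect)  # True
```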
Mutual exclusivity (mutex) between two actions means that they cannot be taken together; it occurs in the following cases:

- *Inconsistent effects*: one action negates an effect of the other.
- *Interference*: one of the effects of one action is the negation of a precondition of the other.
- *Competing needs*: one of the preconditions of one action is mutually exclusive with a precondition of the other.
We can say that we have reached our goal if all the goal literals appear in the current level and no pair of them is mutually exclusive.
In [2]:
%psource Graph
In [3]:
%psource Level
A planning graph can be used to give better heuristic estimates, which can be applied to any of the search techniques. Alternatively, we can search for a solution over the space formed by the planning graph, using an algorithm called GraphPlan.
The GraphPlan algorithm repeatedly adds a level to a planning graph. Once all the goals show up as non-mutex in the graph, the algorithm runs backward from the last level to the first, searching for a plan that solves the problem. If that fails, it records the (level, goals) pair as a no-good (as in constraint learning for CSPs), expands another level and tries again, terminating with failure when there is no reason to go on.
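The expansion half of this loop can be sketched on a delete-free toy problem (all names here, `expand_until_goals` included, are hypothetical; the sketch ignores mutex bookkeeping and the backward extraction step):

```python
def expand_until_goals(init, goals, actions, max_levels=10):
    """Expand fluent levels until every goal appears, mimicking GraphPlan's
    outer loop on a delete-free problem (so no mutex tracking is needed).
    Returns the number of action levels required, or None if the graph
    levels off without reaching the goals."""
    fluents = set(init)
    for level in range(max_levels + 1):
        if set(goals) <= fluents:
            return level
        # every applicable action contributes its positive effects to the next level
        new = set()
        for precond, effect in actions:
            if precond <= fluents:
                new |= effect
        if new <= fluents:  # levelled off: no new fluents, goals unreachable
            return None
        fluents |= new
    return None

# Toy domain: fetch a key, then open a door.
actions = [(frozenset({'HaveKey'}), frozenset({'DoorOpen'})),
           (frozenset({'AtKey'}), frozenset({'HaveKey'}))]
print(expand_until_goals({'AtKey'}, {'DoorOpen'}, actions))  # 2
```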
In [4]:
%psource GraphPlan
The description of a planning problem defines a search problem: we can search from the initial state through the space of states, looking for a goal. One of the nice advantages of the declarative representation of action schemas is that we can also search backward from the goal, looking for the initial state.
However, neither forward nor backward search is efficient without a good heuristic function, because real-world planning problems often have large state spaces. A heuristic function $h(s)$ estimates the distance from a state $s$ to the goal and, if it is admissible, i.e., if it does not overestimate, then we can use $A^*$ search to find optimal solutions.
Planning uses a factored representation for states and action schemas which makes it possible to define good domain-independent heuristics to prune the search space.
An admissible heuristic can be derived by defining a relaxed problem that is easier to solve. The length of the solution of this easier problem then becomes the heuristic for the original problem. Assume that all goals and preconditions contain only positive literals, i.e., that the problem is defined according to the Stanford Research Institute Problem Solver (STRIPS) notation. We want to create a relaxed version of the original problem, easier to solve, by ignoring the delete lists of all actions, i.e., by removing all negative literals from effects. As shown in [1], the planning graph of a relaxed problem does not contain any mutex relations at all (which is the crucial thing when building a planning graph), so GraphPlan will never backtrack looking for a solution. For this reason, the ignore-delete-lists heuristic makes it possible to find the optimal solution of the relaxed problem in polynomial time through the GraphPlan algorithm.
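The relaxation itself is tiny. As a sketch (with effects modelled as plain strings and a leading `~` marking a negative literal, an assumption of this example rather than the notebook's `Expr` representation):

```python
def ignore_delete_lists(effects):
    """Ignore-delete-lists relaxation: keep only the add list, i.e. drop
    every negative literal (here, any string starting with '~') from the effects."""
    return [e for e in effects if not e.startswith('~')]

effects = ['At(Spare, Axle)', '~At(Spare, Ground)']
print(ignore_delete_lists(effects))  # ['At(Spare, Axle)']
```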
In [5]:
from search import *
Forward search through the space of states, starting in the initial state and using the problem’s actions to search forward for a member of the set of goal states.
In [6]:
%psource ForwardPlan
Backward search through sets of relevant states, starting at the set of states representing the goal and using the inverse of the actions to search backward for the initial state.
In [7]:
%psource BackwardPlan
In forward planning, the search is constrained by the initial state and only uses the goal as a stopping criterion and as a source for heuristics. In regression planning, the search is constrained by the goal and only uses the start state as a stopping criterion and as a source for heuristics. By converting the problem to a constraint satisfaction problem (CSP), the initial state can be used to prune what is not reachable and the goal to prune what is not useful. The CSP will be defined for a finite number of steps; the number of steps can be adjusted to find the shortest plan. One of the CSP methods can then be used to solve the CSP and thus find a plan.
To construct a CSP from a planning problem, first choose a fixed planning horizon, which is the number of time steps over which to plan. Suppose the horizon is $k$. The CSP has the following variables:

- a state variable for each feature and each time from 0 to $k$, whose domain is the domain of the feature;
- an action variable for each time $t$ from 0 to $k-1$, whose domain is the set of all possible actions, representing the action that takes the agent from the state at time $t$ to the state at time $t+1$.
There are several types of constraints:

- *precondition constraints*, between a state variable at time $t$ and the action variable at time $t$, specifying which actions are legal in a given state;
- *effect constraints*, between the action variable at time $t$ and a state variable at time $t+1$;
- *frame constraints*, specifying that a feature keeps its value when no action at time $t$ changes it;
- *initial-state constraints* on the state variables at time 0, and *goal constraints* on the state variables at time $k$.
The PDDL representation gives precondition, effect and frame constraints for each time $t$ as follows:
The CSP representation assumes a fixed planning horizon (i.e., a fixed number of steps). To find a plan over any number of steps, the algorithm can be run for horizons $k = 0, 1, 2, \dots$ until a solution is found.
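The variable layout for a given horizon can be sketched as follows (a hypothetical helper, assuming one state variable per fluent per time step and a single action variable per step):

```python
def csp_variables(fluents, k):
    """Variables of the planning CSP for horizon k (a sketch): one state
    variable per fluent for each time 0..k, and one action variable for
    each of the k time steps, ranging over the available actions."""
    state_vars = [(f, t) for t in range(k + 1) for f in fluents]
    action_vars = [('Action', t) for t in range(k)]
    return state_vars, action_vars

state_vars, action_vars = csp_variables(['DoorOpen', 'HaveKey'], 2)
print(len(state_vars), len(action_vars))  # 6 2
```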
In [8]:
from csp import *
In [9]:
%psource CSPlan
As shown in [2], the translation of a Planning Domain Definition Language (PDDL) description into a Conjunctive Normal Form (CNF) formula is a series of straightforward steps:

- *Propositionalize the actions*: replace each action schema with a set of ground actions.
- *Define the initial state*: assert $F^0$ for every fluent $F$ that holds in the initial state, and $\lnot F^0$ for every fluent not mentioned.
- *Propositionalize the goal*: the goal becomes a disjunction over all its ground instances.
- *Add successor-state axioms*: for each fluent $F$ and time $t$, assert $F^{t+1} \Leftrightarrow ActionCausesF^t \lor (F^t \land \lnot ActionCausesNotF^t)$, where $ActionCausesF$ is a disjunction of all the ground actions that have $F$ in their add list, and $ActionCausesNotF$ is a disjunction of all the ground actions that have $F$ in their delete list.
- *Add precondition axioms*: for each ground action $A$, assert $A^t \Rightarrow Pre(A)^t$.
- *Add action exclusion axioms*: assert that every time step has exactly one action.
A propositional planning procedure implements the basic idea just given but, because the agent does not know how many steps it will take to reach the goal, the algorithm tries each possible number of steps $t$, up to some maximum conceivable plan length $T_{max}$. In this way, it is guaranteed to find the shortest plan if one exists. Because of the way the propositional planning procedure searches for a solution, this approach cannot be used in a partially observable environment: the procedure would just set the unobservable variables to the values it needs to create a solution.
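For instance, the successor-state axiom for a single ground fluent can be assembled as follows (a sketch over plain strings rather than the notebook's `Expr` objects; the fluent and action names are made up):

```python
def successor_state_axiom(fluent, t, add_actions, del_actions):
    """Render the successor-state axiom
    F^{t+1} <=> ActionCausesF^t | (F^t & ~ActionCausesNotF^t)
    as a string: add_actions have F in their add list, del_actions have F
    in their delete list."""
    causes = ' | '.join('%s_%d' % (a, t) for a in add_actions)
    causes_not = ' | '.join('%s_%d' % (a, t) for a in del_actions)
    return '%s_%d <=> (%s) | (%s_%d & ~(%s))' % (fluent, t + 1, causes,
                                                 fluent, t, causes_not)

print(successor_state_axiom('HaveKey', 0, ['PickUp'], ['Drop']))
# HaveKey_1 <=> (PickUp_0) | (HaveKey_0 & ~(Drop_0))
```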
In [10]:
from logic import *
In [11]:
%psource SATPlan
In [12]:
%psource SAT_plan
In [13]:
%psource three_block_tower
In [14]:
%time blocks_world_solution = GraphPlan(three_block_tower()).execute()
linearize(blocks_world_solution)
Out[14]:
In [15]:
%time blocks_world_solution = uniform_cost_search(ForwardPlan(three_block_tower()), display=True).solution()
blocks_world_solution = list(map(lambda action: Expr(action.name, *action.args), blocks_world_solution))
blocks_world_solution
Out[15]:
In [16]:
%time blocks_world_solution = astar_search(ForwardPlan(three_block_tower()), display=True).solution()
blocks_world_solution = list(map(lambda action: Expr(action.name, *action.args), blocks_world_solution))
blocks_world_solution
Out[16]:
In [17]:
%time blocks_world_solution = uniform_cost_search(BackwardPlan(three_block_tower()), display=True).solution()
blocks_world_solution = list(map(lambda action: Expr(action.name, *action.args), blocks_world_solution))
blocks_world_solution[::-1]
Out[17]:
In [18]:
%time blocks_world_solution = astar_search(BackwardPlan(three_block_tower()), display=True).solution()
blocks_world_solution = list(map(lambda action: Expr(action.name, *action.args), blocks_world_solution))
blocks_world_solution[::-1]
Out[18]:
In [19]:
%time blocks_world_solution = CSPlan(three_block_tower(), 3, arc_heuristic=no_heuristic)
blocks_world_solution
Out[19]:
In [20]:
%time blocks_world_solution = CSPlan(three_block_tower(), 3, arc_heuristic=sat_up)
blocks_world_solution
Out[20]:
In [27]:
%time blocks_world_solution = SATPlan(three_block_tower(), 4, SAT_solver=dpll_satisfiable)
blocks_world_solution
Out[27]:
In [28]:
%time blocks_world_solution = SATPlan(three_block_tower(), 4, SAT_solver=cdcl_satisfiable)
blocks_world_solution
Out[28]:
In [21]:
%psource spare_tire
In [29]:
%time spare_tire_solution = GraphPlan(spare_tire()).execute()
linearize(spare_tire_solution)
Out[29]:
In [30]:
%time spare_tire_solution = uniform_cost_search(ForwardPlan(spare_tire()), display=True).solution()
spare_tire_solution = list(map(lambda action: Expr(action.name, *action.args), spare_tire_solution))
spare_tire_solution
Out[30]:
In [31]:
%time spare_tire_solution = astar_search(ForwardPlan(spare_tire()), display=True).solution()
spare_tire_solution = list(map(lambda action: Expr(action.name, *action.args), spare_tire_solution))
spare_tire_solution
Out[31]:
In [32]:
%time spare_tire_solution = uniform_cost_search(BackwardPlan(spare_tire()), display=True).solution()
spare_tire_solution = list(map(lambda action: Expr(action.name, *action.args), spare_tire_solution))
spare_tire_solution[::-1]
Out[32]:
In [33]:
%time spare_tire_solution = astar_search(BackwardPlan(spare_tire()), display=True).solution()
spare_tire_solution = list(map(lambda action: Expr(action.name, *action.args), spare_tire_solution))
spare_tire_solution[::-1]
Out[33]:
In [34]:
%time spare_tire_solution = CSPlan(spare_tire(), 3, arc_heuristic=no_heuristic)
spare_tire_solution
Out[34]:
In [35]:
%time spare_tire_solution = CSPlan(spare_tire(), 3, arc_heuristic=sat_up)
spare_tire_solution
Out[35]:
In [36]:
%time spare_tire_solution = SATPlan(spare_tire(), 4, SAT_solver=dpll_satisfiable)
spare_tire_solution
Out[36]:
In [37]:
%time spare_tire_solution = SATPlan(spare_tire(), 4, SAT_solver=cdcl_satisfiable)
spare_tire_solution
Out[37]:
In [22]:
%psource shopping_problem
In [45]:
%time shopping_problem_solution = GraphPlan(shopping_problem()).execute()
linearize(shopping_problem_solution)
Out[45]:
In [46]:
%time shopping_problem_solution = uniform_cost_search(ForwardPlan(shopping_problem()), display=True).solution()
shopping_problem_solution = list(map(lambda action: Expr(action.name, *action.args), shopping_problem_solution))
shopping_problem_solution
Out[46]:
In [47]:
%time shopping_problem_solution = astar_search(ForwardPlan(shopping_problem()), display=True).solution()
shopping_problem_solution = list(map(lambda action: Expr(action.name, *action.args), shopping_problem_solution))
shopping_problem_solution
Out[47]:
In [48]:
%time shopping_problem_solution = uniform_cost_search(BackwardPlan(shopping_problem()), display=True).solution()
shopping_problem_solution = list(map(lambda action: Expr(action.name, *action.args), shopping_problem_solution))
shopping_problem_solution[::-1]
Out[48]:
In [49]:
%time shopping_problem_solution = astar_search(BackwardPlan(shopping_problem()), display=True).solution()
shopping_problem_solution = list(map(lambda action: Expr(action.name, *action.args), shopping_problem_solution))
shopping_problem_solution[::-1]
Out[49]:
In [50]:
%time shopping_problem_solution = CSPlan(shopping_problem(), 5, arc_heuristic=no_heuristic)
shopping_problem_solution
Out[50]:
In [51]:
%time shopping_problem_solution = CSPlan(shopping_problem(), 5, arc_heuristic=sat_up)
shopping_problem_solution
Out[51]:
In [11]:
%time shopping_problem_solution = SATPlan(shopping_problem(), 5, SAT_solver=cdcl_satisfiable)
shopping_problem_solution
Out[11]:
In [23]:
%psource air_cargo
In [38]:
%time air_cargo_solution = GraphPlan(air_cargo()).execute()
linearize(air_cargo_solution)
Out[38]:
In [39]:
%time air_cargo_solution = uniform_cost_search(ForwardPlan(air_cargo()), display=True).solution()
air_cargo_solution = list(map(lambda action: Expr(action.name, *action.args), air_cargo_solution))
air_cargo_solution
Out[39]:
In [40]:
%time air_cargo_solution = astar_search(ForwardPlan(air_cargo()), display=True).solution()
air_cargo_solution = list(map(lambda action: Expr(action.name, *action.args), air_cargo_solution))
air_cargo_solution
Out[40]:
In [41]:
%time air_cargo_solution = uniform_cost_search(BackwardPlan(air_cargo()), display=True).solution()
air_cargo_solution = list(map(lambda action: Expr(action.name, *action.args), air_cargo_solution))
air_cargo_solution[::-1]
Out[41]:
In [42]:
%time air_cargo_solution = astar_search(BackwardPlan(air_cargo()), display=True).solution()
air_cargo_solution = list(map(lambda action: Expr(action.name, *action.args), air_cargo_solution))
air_cargo_solution[::-1]
Out[42]:
In [43]:
%time air_cargo_solution = CSPlan(air_cargo(), 6, arc_heuristic=no_heuristic)
air_cargo_solution
Out[43]:
In [44]:
%time air_cargo_solution = CSPlan(air_cargo(), 6, arc_heuristic=sat_up)
air_cargo_solution
Out[44]: