This is the notebook for the Module 7 Programming Assignment.
Here are a few tips for using the iPython HTML notebook:
You should rename this notebook to be <your JHED id>_PR7.ipynb. Do it right now.
Make certain the entire notebook executes before you submit it. (See Hint #5 above)
Change the following variables:
In [1]:
name = "Shehzad Qureshi"
jhed_id = "squresh6"

if name == "Student Name" or jhed_id == "sname1":
    raise Exception("Change the name and/or JHED ID...preferably to yours.")
Add whatever additional imports you require here. Stick with the standard libraries and those required by the class. The import gives you access to these functions: http://ipython.org/ipython-doc/stable/api/generated/IPython.core.display.html which, among other things, will permit you to display HTML as the result of evaluated code (see HTML() or display_html()).
In [2]:
from IPython.core.display import *
from copy import deepcopy
In past semesters, I've assigned a regular homework problem for this week which involved doing regression planning and graphplan by hand. The students don't generally like it and I don't generally like grading it.
This semester, we're going to try something different. With the search algorithms from Module 1 and the unification algorithms from Module 6, you're going to implement a forward (progression) planner. The first step should be to grab all of the essential code from your Module 6 notebook for unification and put it in this one cell. You don't even need comments. You might also want to implement the parser, too, to make things a bit easier.
Unification code here
In [3]:
def is_constant(x):
    return type(x) == str and not x.startswith("?")

def is_variable(x):
    return type(x) == str and x.startswith("?")

def is_list(x):
    return type(x) == list

def get_varname(var):
    return var[1:]

def make_var(var):
    return "?{0}".format(var)

def get_value(val):
    if is_list(val):
        lval, rval = val[0], val[1:]
        if is_list(rval[0]): return "{0}({1})".format(lval, get_value(rval[0]))
        else: return "{0}({1})".format(lval, ",".join(rval))
    else:
        return val

def pair(x, y):
    var = get_varname(x)
    val = get_value(y)
    return "{0}/{1}".format(var, val)

def occurs(x, y):
    return x in y

def parse_substitution(sub):
    var = sub.split("/")
    return (var[0], var[1])

def propagate(x, sub):
    if len(sub) == 0 or len(x) == 0: return x
    var, value = parse_substitution(sub[0])
    for i, xval in enumerate(x):
        if is_list(xval): x[i] = propagate(xval, sub)
        elif make_var(var) in xval:
            x[i] = value
    return x

def unify_variable(x, y, th):
    if occurs(x, y): return None
    else:
        if pair(x, y) not in th: th.append(pair(x, y))
        return th

def unify_expression(x, y, th):
    if th is None: return th
    elif is_list(x) and is_list(y) and len(x) == 0 and len(y) == 0: return th
    elif is_constant(x) and is_constant(y): return th if x == y else None
    elif is_variable(x): return unify_variable(x, y, th)
    elif is_variable(y): return unify_variable(y, x, th)
    result1 = unify(x[0], y[0])
    if result1 is None: return None
    if len(result1) > 0: x[1:], y[1:] = propagate(x[1:], result1), propagate(y[1:], result1)
    result2 = unify(x[1:], y[1:])
    if result2 is None: return None
    return result1 + result2

def unify(x, y):
    th = list()
    return unify_expression(x, y, th)

def parse_compound(expr):
    vals = expr[:-1].split("(", 1)
    for i, val in enumerate(vals):
        if "(" in val and ")" in val:
            vals[i] = parse_compound(vals[i])
    return vals

def parse(expr):
    svalues = expr[1:-1].split(" ")
    for i, val in enumerate(svalues):
        if "(" in val and ")" in val:
            svalues[i] = parse_compound(val)
    return svalues
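As a quick sanity check on the cell above, here is what parse and unify should produce on flat expressions. The helpers are re-declared in stripped-down form (flat lists only, no nested terms, no repeated-binding consistency check) so the example runs on its own; they are illustrative stand-ins, not the full versions above.

```python
def parse(expr):
    # Flat s-expressions only: "(at Me Home)" -> ["at", "Me", "Home"].
    return expr[1:-1].split(" ")

def unify(x, y):
    # Minimal unification for equal-length flat lists of constants and
    # variables; returns substitutions in "var/value" form, or None on failure.
    theta = []
    for a, b in zip(x, y):
        if a.startswith("?"):
            theta.append("{0}/{1}".format(a[1:], b))
        elif b.startswith("?"):
            theta.append("{0}/{1}".format(b[1:], a))
        elif a != b:
            return None
    return theta

print(unify(parse("(at ?agent Home)"), parse("(at Me Home)")))  # ['agent/Me']
print(unify(parse("(at Me Home)"), parse("(at Me Store)")))     # None
```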
Now on to the main task. If you look in your book you will not find pseudocode for a forward planner. It just says "use state space search" but this is less than helpful and it's a bit more complicated than that.
At a high level, a forward planner takes the current state of the world $S_0$ and attempts to derive a plan, basically by Depth First Search. We have all the ingredients we said we would need in Module 1: states, actions, a transition function and a goal test. We have a set of predicates that describe a state (and therefore all possible states), we have actions and we have, at least, an implicit transition function: applying an action in a state causes the state to change as described by the add and delete lists.
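The ingredients listed above map directly onto a generic graph-search skeleton. The sketch below is illustrative only: successors, is_goal, and the toy graph are placeholders standing in for action application and the goal test, not the planner's actual interfaces.

```python
def graph_search(start, is_goal, successors):
    # Depth-first graph search: the explored set prevents revisiting states.
    frontier = [(start, [start])]
    explored = set()
    while frontier:
        state, path = frontier.pop()
        if is_goal(state):
            return path
        if state in explored:
            continue
        explored.add(state)
        for nxt in successors(state):
            if nxt not in explored:
                frontier.append((nxt, path + [nxt]))
    return None

# Toy example: walk a tiny graph from "Home" to "Drill bought".
toy = {"Home": ["Store"], "Store": ["Home", "Drill bought"], "Drill bought": []}
path = graph_search("Home", lambda s: s == "Drill bought", lambda s: toy[s])
print(path)  # ['Home', 'Store', 'Drill bought']
```

The explored set is what makes this graph search rather than tree search; without it, the Home/Store cycle in the toy graph would loop forever.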
Let's take an example where $S_0$ = item(Drill), place(Home), agent(Me), at(Me, Home) and at(Drill, Store), which we might represent as follows:
initial_state = [
    ["item", "Drill"],
    ["place", "Home"],
    ["agent", "Me"],
    ["at", "Me", "Home"],
    ["at", "Drill", "Store"]
]
And we have a goal state:
goal_state = [
    ["item", "Drill"],
    ["place", "Home"],
    ["place", "Store"],
    ["agent", "Me"],
    ["at", "Me", "Home"],
    ["at", "Drill", "Me"]
]
And we have the following actions:
actions = {
    "drive": {
        "action": ["drive", "?agent", "?from", "?to"],
        "conditions": [
            ["agent", "?agent"],
            ["place", "?from"],
            ["place", "?to"],
            ["at", "?agent", "?from"]
        ],
        "add": [
            ["at", "?agent", "?to"]
        ],
        "delete": [
            ["at", "?agent", "?from"]
        ]
    },
    "buy": {
        "action": ["buy", "?purchaser", "?seller", "?item"],
        "conditions": [
            ["item", "?item"],
            ["place", "?seller"],
            ["agent", "?purchaser"],
            ["at", "?item", "?seller"],
            ["at", "?purchaser", "?seller"]
        ],
        "add": [
            ["at", "?item", "?purchaser"]
        ],
        "delete": [
            ["at", "?item", "?seller"]
        ]
    }
}
So, there is an outer level of search that is exactly what I describe here: you have a state, you generate successor states by applying actions to the current state, and you examine those successor states as we did in the first week of the semester. If one is the goal, you stop; if you see a repeated state, you put it on the explored list (you should implement graph search, not tree search). What could be simpler?
It turns out the Devil is in the details. There is an inner level of search hidden in "you generate successor states by applying actions to the current state". Where?
How do you know if an action applies in a state? Only if the preconditions successfully unify with the current state. That seems easy enough...you check each action to see if it unifies with the current state and if it does, you use the substitution list on the action, the add and delete lists and create the successor state based on them.
Except for one small problem...there may be more than one way to unify an action with the current state. You must essentially search for all successful unifications of the candidate action and the current state. This is where my question from throughout the semester applies: "how would you modify state space search to return all the paths to the goal?"
Unification can be seen as state space search by trying to unify the first precondition with the current state, progressively working your way through the precondition list. If you fail at any point, you may need to backtrack because there might have been another unification of that predicate that would succeed. And, as already mentioned, there may be more than one successful unification overall.
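That inner search can be sketched on its own: match each precondition against every fact in the state, carry the bindings forward, and backtrack on failure so that every successful unification is found. This sketch uses plain dicts for substitutions instead of the notebook's "var/value" strings, purely for brevity; match and all_groundings are illustrative names, not required parts of the assignment.

```python
def match(pattern, fact, bindings):
    # Try to unify one precondition pattern against one ground fact,
    # respecting bindings already made by earlier preconditions.
    if len(pattern) != len(fact):
        return None
    b = dict(bindings)
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            if b.get(p, f) != f:
                return None  # variable already bound to something else
            b[p] = f
        elif p != f:
            return None
    return b

def all_groundings(preconditions, state, bindings=None):
    # Backtracking search: yield EVERY way the preconditions unify with
    # the state, not just the first one found.
    if bindings is None:
        bindings = {}
    if not preconditions:
        yield bindings
        return
    first, rest = preconditions[0], preconditions[1:]
    for fact in state:
        b = match(first, fact, bindings)
        if b is not None:
            for result in all_groundings(rest, state, b):
                yield result

state = [["agent", "Me"], ["place", "Home"], ["place", "Store"], ["at", "Me", "Home"]]
pre = [["agent", "?agent"], ["place", "?from"], ["place", "?to"], ["at", "?agent", "?from"]]
bindings = list(all_groundings(pre, state))
# Two groundings: ?to can be Home or Store (?agent=Me, ?from=Home in both).
```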
So...by using unification and a properly defined successors function, you should be able to apply graph-based search to the problem and return a "path" through the states from the initial state to the goal. You'll definitely want to use graph-based search since drive(Me, Store), drive(Me, Home), drive(Me, Store), drive(Me, Home), drive(Me, Store), buy(Me, Store, Drill), drive(Me, Home) is a valid plan.
Your function should return the plan...but if you pass an extra debug=True parameter, it should also return the intermediate states.
To make this easier on yourself, you may want to go ahead and implement that parser...
All the usual directions apply. Functions need documentation of both what they do and the AI concept they represent/implement (if any), they should be as short as possible < 15 lines. Finish up with 2-3 paragraphs about the assignment/module.
We first need a function to determine if we have reached our target state. This function checks whether every predicate in state appears in goal; in other words, it tests whether state is a subset of goal.
In [4]:
def check_state(state, goal):
    return all(x in goal for x in state)
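A quick standalone illustration of the subset semantics (check_state is re-declared here so the example is self-contained):

```python
def check_state(state, goal):
    # True when every predicate in state also appears in goal.
    return all(x in goal for x in state)

goal = [["at", "Me", "Home"], ["at", "Drill", "Me"]]
print(check_state([["at", "Drill", "Me"]], goal))     # True
print(check_state([["at", "Drill", "Store"]], goal))  # False
```

Note the direction of the test: it asks whether state is a subset of goal, so callers pass the predicates they care about as state.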
We will be keeping track of our action plan and populating our unified variables as we go along. This function applies a substitution to every part of an action: the action itself, its conditions, and its add and delete lists.
In [5]:
def populate_action(action, sub):
    a = deepcopy(action)
    a["action"] = propagate(a["action"], sub)
    a["conditions"] = propagate(a["conditions"], sub)
    a["add"] = propagate(a["add"], sub)
    a["delete"] = propagate(a["delete"], sub)
    return a
Next we need to compute the successor state that results from applying an action. This function removes the predicates on the action's delete list from the current state and appends those on its add list.
In [6]:
def next_state(state, action):
    for a in action["delete"]:
        if a in state: state.remove(a)
    for a in action["add"]:
        state.append(a)
    return state
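For example, applying a grounded drive action's add and delete lists transforms the state in place (next_state is re-declared so this cell stands alone; note that the caller is expected to deepcopy first, since the list is mutated):

```python
def next_state(state, action):
    # Transition function: remove the delete list, append the add list.
    for a in action["delete"]:
        if a in state:
            state.remove(a)
    for a in action["add"]:
        state.append(a)
    return state

state = [["agent", "Me"], ["at", "Me", "Home"]]
drive = {"add": [["at", "Me", "Store"]], "delete": [["at", "Me", "Home"]]}
next_state(state, drive)
print(state)  # [['agent', 'Me'], ['at', 'Me', 'Store']]
```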
This function does basic forward checking: given a candidate substitution, it verifies that some precondition mentioning the variable can still be satisfied in the current state.
In [7]:
def forward_check(sub, state, preconditions, debug=False):
    var, value = parse_substitution(sub[0])
    for p in preconditions:
        if make_var(var) in p:
            s = propagate([p], sub)
            if s[0] in state:
                return True
    if debug: print("Forward Checking {0} failed".format(sub[0]))
    return False
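The idea can be shown in isolation with a simplified, dict-free sketch: after tentatively binding a variable, every precondition mentioning it that becomes fully ground must still appear in the state. This version differs from the notebook's string-substitution implementation, and its names are illustrative.

```python
def forward_check(var, value, preconditions, state):
    # AC-3-style look-ahead: reject a binding as soon as it makes some
    # fully-ground precondition unsatisfiable in the current state.
    for p in preconditions:
        if var in p:
            grounded = [value if term == var else term for term in p]
            if all(not t.startswith("?") for t in grounded) and grounded not in state:
                return False
    return True

state = [["agent", "Me"], ["place", "Home"], ["at", "Me", "Home"]]
pre = [["agent", "?agent"], ["at", "?agent", "Home"]]
print(forward_check("?agent", "Me", pre, state))   # True
print(forward_check("?agent", "You", pre, state))  # False: no at(You, Home)
```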
A wrapper function for doing inference. This does memoization as well as calling forward_check above.
In [8]:
def inference(sub, action, state, debug=False):
    failed = action["failed"]
    preconditions = deepcopy(action["conditions"])
    if sub[0] in failed: return False
    if forward_check(sub, state, preconditions, debug=debug) is False:
        failed.append(sub[0])
        return False
    return True
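The memoization idea in isolation: once a substitution fails the (comparatively expensive) forward check, cache it so it is never re-checked. Everything here (the failed set, expensive_check) is an illustrative sketch, not the notebook's actual data structures.

```python
failed = set()

def inference(sub, check):
    # Memoized inference: a substitution that failed once will fail again,
    # so cache it and skip the more expensive re-check.
    if sub in failed:
        return False
    if not check(sub):
        failed.add(sub)
        return False
    return True

calls = []
def expensive_check(sub):
    calls.append(sub)  # record how often the real check actually runs
    return sub == "agent/Me"

inference("agent/You", expensive_check)  # False; check runs
inference("agent/You", expensive_check)  # False; memoized, check skipped
inference("agent/Me", expensive_check)   # True
print(calls)  # ['agent/You', 'agent/Me']
```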
This function checks whether an instantiated action is applicable: its conditions must hold in the current state, its conditions must contain no duplicates, and the action must not already appear in the plan.
In [9]:
def check_action(action, state, plan):
    if check_state(action["conditions"], state) is False: return False
    if any(action["conditions"].count(x) > 1 for x in action["conditions"]): return False
    if action["action"] in plan: return False
    return True
This is our function for backtracking search over precondition choices. Initially, no inference was applied, which resulted in a lot of redundant branching.
In [10]:
def backtrack_condition(state, goal, actions, plan, action, debug=False):
    for c in action["conditions"]:
        for v in filter(None, [unify(c, x) for x in state]):
            if inference(v, actions[action["action"][0]], state, debug=debug) is False: continue
            a = populate_action(deepcopy(action), v)
            if debug: print("Checking ", a["conditions"])
            if check_action(a, state, plan):
                if debug: print("Added ", a["action"])
                plan.append(a["action"])
                return backtrack_action(next_state(state, a), goal, actions, plan, debug=debug)
            else:
                result, a = backtrack_condition(state, goal, actions, plan, a, debug=debug)
                if result: return result, plan
    return False, plan
This is our function for backtracking search for action plans. It is a simple recursive search that tries each action in order.
In [11]:
def backtrack_action(state, goal, actions, plan, debug=False):
    if check_state(state, goal): return True, plan
    for action in actions:
        if debug: print("---\nState ", state, "\nTrying ", actions[action]["action"])
        result, plan = backtrack_condition(state, goal, actions, plan, deepcopy(actions[action]), debug=debug)
        if result: return result, plan
    return False, plan
The main entry point for our planner. We add a "failed" list to the action structure to use for memoization.
In [12]:
def forward_planner(start_state, goal, actions, debug=False):
    for action in actions:
        actions[action]["failed"] = []
    result, plan = backtrack_action(deepcopy(start_state), goal, actions, [], debug=debug)
    return plan if result else None
Testing
Our standard test from the assignment. We need to go to the store to buy a drill, then return home.
In [13]:
actions = {
    "drive": {
        "action": ["drive", "?agent", "?from", "?to"],
        "conditions": [
            ["agent", "?agent"],
            ["place", "?from"],
            ["place", "?to"],
            ["at", "?agent", "?from"]
        ],
        "add": [
            ["at", "?agent", "?to"]
        ],
        "delete": [
            ["at", "?agent", "?from"]
        ]
    },
    "buy": {
        "action": ["buy", "?purchaser", "?seller", "?item"],
        "conditions": [
            ["item", "?item"],
            ["place", "?seller"],
            ["agent", "?purchaser"],
            ["at", "?item", "?seller"],
            ["at", "?purchaser", "?seller"]
        ],
        "add": [
            ["at", "?item", "?purchaser"]
        ],
        "delete": [
            ["at", "?item", "?seller"]
        ]
    }
}
In [14]:
initial_state = [
    ["item", "Drill"],
    ["place", "Home"],
    ["place", "Store"],
    ["agent", "Me"],
    ["at", "Me", "Home"],
    ["at", "Drill", "Store"]
]
goal_state = [
    ["item", "Drill"],
    ["place", "Home"],
    ["place", "Store"],
    ["agent", "Me"],
    ["at", "Me", "Home"],
    ["at", "Drill", "Me"]
]
forward_planner(initial_state, goal_state, actions, debug=True)
Out[14]:
We're going to add another item to buy at the store, to see how the algorithm performs.
In [15]:
initial_state = [
    ["item", "Drill"],
    ["item", "Paint"],
    ["place", "Home"],
    ["place", "Store"],
    ["agent", "Me"],
    ["at", "Me", "Home"],
    ["at", "Drill", "Store"],
    ["at", "Paint", "Store"]
]
goal_state = [
    ["item", "Drill"],
    ["item", "Paint"],
    ["place", "Home"],
    ["place", "Store"],
    ["agent", "Me"],
    ["at", "Me", "Home"],
    ["at", "Drill", "Me"],
    ["at", "Paint", "Me"]
]
forward_planner(initial_state, goal_state, actions)
Out[15]:
Let's attempt to buy some Milk from the Grocery store.
In [16]:
initial_state = [
    ["item", "Drill"],
    ["item", "Milk"],
    ["place", "Home"],
    ["place", "Store"],
    ["place", "Grocery"],
    ["agent", "Me"],
    ["at", "Me", "Home"],
    ["at", "Drill", "Store"],
    ["at", "Milk", "Grocery"]
]
goal_state = [
    ["item", "Drill"],
    ["item", "Milk"],
    ["place", "Home"],
    ["place", "Store"],
    ["place", "Grocery"],
    ["agent", "Me"],
    ["at", "Me", "Home"],
    ["at", "Drill", "Me"],
    ["at", "Milk", "Me"]
]
forward_planner(initial_state, goal_state, actions)
Out[16]:
Observations
The forward planning algorithm implemented above uses a two-level backtracking search. Backtracking is performed in the action space as well as the precondition space. The implementation starts by considering an action and then testing its preconditions to see if it applies; this is the first level of backtracking search. The second level occurs when testing the preconditions: it tries to unify each precondition with the current state, one possibility at a time, so that it can properly backtrack between alternatives.
Initially, backtracking had no inference implemented, so it would generate every single possible state. This becomes very inefficient as the number of predicates and actions increases. Inference was implemented to reduce the number of successor states when searching among precondition states. The first inference check is similar to AC-3: it checks whether a candidate variable binding can actually be used in the current state, and if not, prevents successor nodes from being generated from it. This cuts down redundancy by a significant factor.
The second inference check was in the form of memoization. The algorithm keeps track of failed (nonsensical) unifications and prevents successor states from being generated. This only offers a minor efficiency improvement though. Even without memoization, forward checking would prevent this variable from being processed, albeit in a more expensive way.
The order in which actions are tried is apparently very important. I did try to implement a sort of Minimum Remaining Values heuristic for choosing an action, but the algorithm got stuck looping between driving to the Store and driving Home. Coupled with the fact that my search algorithm prevents duplicate entries in plans, the algorithm never returned a result. There should be some smarter way of picking which action and which precondition to try first; alas, there wasn't enough time to explore those options.