In [1]:
import itertools
import math
import matplotlib.pyplot as plt
import numpy as np
import random
%matplotlib inline
print("Numpy:", np.__version__)
random.seed(0)
Last week we covered several different exercises, including a solver for a simplified version of the travelling salesman problem. Instead of finding a cycle, we only find the optimal path; there is no requirement to return to the starting point.
This problem has numerous applications, and we may want to use our solution again in other projects. Let's take a look at our solution and then decide how we can make it more useful.
In [65]:
coords = [(0,0), (10,5), (10,10), (5,10), (3,3), (3,7), (12,3), (10,11)]

def distance(coords):
    distance = 0
    for p1, p2 in zip(coords[:-1], coords[1:]):
        distance += ((p2[0] - p1[0]) ** 2 + (p2[1] - p1[1]) ** 2) ** 0.5
    return distance

print(distance(coords))

all_options = []
for option in itertools.permutations(coords, len(coords)):
    all_options.append((option, distance(option)))
all_options.sort(key=lambda x: x[1])
print('Shortest route:', all_options[0])
print('Number of routes considered:', len(all_options))
Currently, the only way to reuse this code would be to copy and paste it. Our first improvement is to convert the brute-force search into a function.
In [66]:
def distance(coords):
    distance = 0
    for p1, p2 in zip(coords[:-1], coords[1:]):
        distance += ((p2[0] - p1[0]) ** 2 + (p2[1] - p1[1]) ** 2) ** 0.5
    return distance


def find_best_route(coords):
    best_distance = distance(coords)
    best = coords
    for option in itertools.permutations(coords, len(coords)):
        option_distance = distance(option)
        if option_distance < best_distance:
            best = option
            best_distance = option_distance
    return (best_distance, best)
In [67]:
coords = [(random.randint(0, 1000), random.randint(0, 1000)) for i in range(8)]
print(coords)
x = np.array([p[0] for p in coords])
y = np.array([p[1] for p in coords])
plt.plot(x, y, 'r-')
best_distance, best_coords = find_best_route(coords)
best_x = np.array([p[0] for p in best_coords])
best_y = np.array([p[1] for p in best_coords])
plt.plot(best_x, best_y, 'bo-')
plt.show()
These functions are now much easier to reuse in this notebook. The next step is to move them out into their own file.
In [68]:
!cat tsp_solver.py
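tsp_solver.py simply contains the distance and find_best_route functions exactly as defined above, together with the import itertools line they depend on.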
In [69]:
import tsp_solver
best_distance, best_coords = tsp_solver.find_best_route(coords)
best_x = np.array([p[0] for p in best_coords])
best_y = np.array([p[1] for p in best_coords])
plt.plot(best_x, best_y, 'bo-')
plt.show()
These functions can now be imported into any other notebook or file. We might stop here, but if this code is really important we can, and should, go further.
Code is typically read much more often than it is written, and following a simple set of guidelines makes code easier to read. The precise guidelines chosen matter less than applying them consistently: a consistent writing style makes reading code much easier.
Individual projects and companies may have their own style guides (such as the Google Python Style Guide and Twisted's), while PEP 8 is the style guide used for Python itself and the most widely adopted in the community.
To help with adhering to the guide, there are a number of software tools that can automatically check our code. flake8 is one example; it checks against the style guide and also catches some other issues.
In [70]:
!flake8 tsp_solver.py
The style guide suggests leaving two blank lines before each function definition so that it stands out more. This is a simple fix.
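Assuming tsp_solver_pep8.py differs from tsp_solver.py only in its spacing, the change looks like this:
import itertools


def distance(coords):
    ...


def find_best_route(coords):
    ...
With the two blank lines in place, flake8 should report nothing: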
In [71]:
!flake8 tsp_solver_pep8.py
At the moment the only way to understand how to use our code is to read all of it. There is no indication of how the input variables should be structured or what the function returns. If we wanted to reuse this code in the future, we would waste a significant amount of time simply re-familiarizing ourselves with what is happening.
Documentation, both included with the code and separate if necessary, can significantly speed up this process.
In Python files we can include documentation as docstrings, both at the top of the file and on individual functions.
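A minimal sketch of how such docstrings might look (the actual tsp_solver_doc.py may differ):
"""Brute-force solver for a simplified travelling salesman problem."""
import itertools


def find_best_route(coords):
    """Return the shortest route visiting every point in coords.

    coords -- a sequence of (x, y) tuples.
    Returns a tuple of (best_distance, best_route), where best_route
    is an ordering of the input points.
    """
    ...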
In [72]:
!cat tsp_solver_doc.py
In [73]:
import tsp_solver_doc
In [74]:
tsp_solver_doc.find_best_route?
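Documentation tells us how the function should behave; we should also check that it actually does. A quick sanity check with a route whose shortest path we can work out by hand: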
In [75]:
print(tsp_solver.find_best_route([(0,0), (0,30), (0,20)]))
This gives the correct result but is just one test. For greater confidence we may want to test a number of different potential issues. Packages exist for writing and running suites of tests but for the moment we will use a simple function.
In [76]:
def test_find_best_route(func):
    # The straight line test
    assert func([(0,0), (0,30), (0,20)])[0] == 30
    # Test a triangle
    assert func([(0,0), (0,30), (20,30)])[0] == 50
    # Test a square
    assert func([(0,0), (0,5), (5,5), (5,0)])[0] == 15
    # Test negative numbers
    assert func([(0,0), (0,-5), (0,5)])[0] == 10
    print('All tests pass')

test_find_best_route(tsp_solver.find_best_route)
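Packages such as pytest make this more systematic; as a sketch, the same checks in a hypothetical test_tsp_solver.py would look like this:
# test_tsp_solver.py (hypothetical): the same checks as separate test cases
from tsp_solver import find_best_route


def test_straight_line():
    assert find_best_route([(0, 0), (0, 30), (0, 20)])[0] == 30


def test_triangle():
    assert find_best_route([(0, 0), (0, 30), (20, 30)])[0] == 50


def test_square():
    assert find_best_route([(0, 0), (0, 5), (5, 5), (5, 0)])[0] == 15
pytest discovers functions named test_* automatically and reports each failure individually.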
We have now moved our solver out to its own file, corrected a styling issue, and added documentation. For each change a new file was created, and if we continue like this we will lose track of which is the most current version. This is a contrived example, but it is a real issue: each time we try a new approach to a problem it is tempting to create a new file in case the original proves superior.
For a small project this might be just about bearable. As a project grows across multiple files it becomes nearly impossible to track the versions and their dependencies manually. This is a very old problem, and there are a number of highly developed software tools for tracking different versions of files.
Instead of creating new files whenever a change is tried, the current version is committed to version control and changes are then made directly. If a change is an improvement it too is committed; if not, it can be discarded and the original restored.
The most popular version control software at the moment is Git.
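A minimal sketch of that workflow with Git, run as shell commands from a notebook cell:
!git init
!git add tsp_solver.py
!git commit -m "Add brute-force TSP solver"
# ...edit tsp_solver.py, then either keep the change...
!git commit -am "Describe the improvement"
# ...or discard it and restore the last committed version:
!git checkout -- tsp_solver.py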
There are a number of good tutorials available online.
In [77]:
coords = [(random.randint(0,1000), random.randint(0,1000)) for i in range(10)]
print(coords)
In [78]:
timings3 = %timeit -o -r 1 find_best_route(coords[:3])
timings4 = %timeit -o -r 1 find_best_route(coords[:4])
timings5 = %timeit -o -r 1 find_best_route(coords[:5])
timings6 = %timeit -o -r 1 find_best_route(coords[:6])
timings7 = %timeit -o -r 1 find_best_route(coords[:7])
timings8 = %timeit -o -r 1 find_best_route(coords[:8])
timings9 = %timeit -o -r 1 find_best_route(coords[:9])
timings10 = %timeit -o -r 1 find_best_route(coords[:10])
timings = [t.best for t in [timings3, timings4, timings5, timings6, timings7, timings8, timings9, timings10]]
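(The -o flag makes %timeit return its result as an object rather than only printing it; .best is the per-loop time of the fastest run, and -r 1 limits the measurement to a single run to keep the slower cases manageable.)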
In [79]:
perms = [math.factorial(i) for i in range(3, 11)]
fig, ax = plt.subplots(1,1)
ax.plot(np.log10(perms), np.log10(timings), 'ko')
ax.set_xlabel('log10 n-factorial', size='x-large')
ax.set_ylabel('log10 running times', size='x-large')
plt.show()
Better algorithms have been developed to solve this problem in less time, but for very large numbers of points even these become prohibitive.
A significant improvement in speed is possible if, instead of looking for the perfect solution, we content ourselves with a good solution.
This is an optimization problem, in which we attempt to choose a set of parameters (the order of the points) that minimizes a cost function (the distance travelled). This is a very common task in machine learning.
Before returning to the travelling salesman problem, let's look at a simpler problem.
In [80]:
def output_function(x):
    return 100 - x ** 2
inputs = np.linspace(-20, 20, 200)
plt.plot(inputs, output_function(inputs), 'b-')
plt.show()
One simple approach to optimization is gradient descent (or, since we are seeking a maximum here, gradient ascent). This problem is an ideal candidate because there is only one maximum: from any starting point we can calculate the gradient and follow it up to the peak.
We will look at gradient descent in greater detail in future classes, as it works very well for many machine learning tasks. Unfortunately, for the travelling salesman problem it is likely to perform poorly.
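A minimal sketch of the idea on output_function (the starting point and the step size of 0.1 are arbitrary choices):
x = 15.0  # arbitrary starting point
for step in range(50):
    gradient = -2 * x      # derivative of 100 - x ** 2
    x += 0.1 * gradient    # step uphill, towards the maximum
print(x, output_function(x))  # x converges towards 0, the single maximum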
In [81]:
coords = [(0,0), (10,5), (10,10), (5,10), (3,3), (3,7), (12,3), (10,11)]

def distance(coords):
    distance = 0
    for p1, p2 in zip(coords[:-1], coords[1:]):
        distance += ((p2[0] - p1[0]) ** 2 + (p2[1] - p1[1]) ** 2) ** 0.5
    return distance

print(distance(coords))

all_options = []
for option in itertools.permutations(coords, len(coords)):
    all_options.append((option, distance(option)))
# Deliberately left unsorted so the plot shows the distances in
# permutation order rather than from best to worst
# all_options.sort(key=lambda x: x[1])
print('Number of routes considered:', len(all_options))
plt.plot([i[1] for i in all_options])
plt.show()
Unlike the simple example above, plotting the distance of each of the possible paths reveals significant variation. Starting at a random point and following the gradient would only get us to the nearest local minimum; the likelihood of this also being the global minimum is very small.
In [82]:
plt.plot([i[1] for i in all_options[:5000]])
plt.show()
In [83]:
plt.plot([i[1] for i in all_options[2000:3000]])
plt.show()
In [84]:
plt.plot([i[1] for i in all_options[2500:2700]])
plt.show()
To find the global minimum our approach will need to explore further than just the local minimum.
We will discuss two such approaches: simulated annealing and genetic algorithms.
There are a number of packages available for both simulated annealing (SimulatedAnnealing, simanneal) and genetic algorithms (pyevolve, DEAP), but both are simple enough that we can implement them ourselves.
Both approaches require a cost function, which we already have in distance, and a way of generating a new candidate solution from the current one. Here we swap a randomly chosen pair of adjacent points:
In [85]:
def new_path(existing_path):
    path = existing_path[:]
    point = random.randint(0, len(path) - 2)
    path[point + 1], path[point] = path[point], path[point + 1]
    return path

print(coords)
print(new_path(coords))
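Simulated annealing starts at a high "temperature", where almost any new path is accepted, and cools over the course of the run. A worse path is accepted with probability exp(-(new_cost - current_cost) / temperature), so early on the search can climb out of local minima, while towards the end it only accepts improvements.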
In [92]:
def simulated_annealing_optimizer(starting_path, cost_func, new_path_func, start_temp, min_temp, steps):
    current_path = starting_path[:]
    current_cost = cost_func(current_path)
    # Exponential cooling schedule, from start_temp down to min_temp
    temp_factor = -np.log(start_temp / min_temp)
    history = []
    for s in range(steps):
        temp = start_temp * np.exp(temp_factor * s / steps)
        new_path = new_path_func(current_path)
        new_cost = cost_func(new_path)
        # Always accept a better path; accept a worse one with the
        # Metropolis probability exp(-(new_cost - current_cost) / temp)
        if (new_cost < current_cost) or (random.random() <= np.exp(-(new_cost - current_cost) / temp)):
            current_path = new_path
            current_cost = new_cost
        record = {'step': s, 'temperature': temp, 'current_cost': current_cost}
        history.append(record)
    return (current_path, current_cost, history)
best_path, best_cost, history = simulated_annealing_optimizer(coords, distance, new_path, 1000, 0.01, 1000)
print(best_cost)
plt.plot([i['current_cost'] for i in history])
plt.show()
plt.plot([i[0] for i in best_path], [i[1] for i in best_path])
plt.show()
In [87]:
coords = [(random.randint(0,1000), random.randint(0,1000)) for i in range(15)]
best_path, best_cost, history = simulated_annealing_optimizer(coords, distance, new_path, 1000, 0.01, 10000)
print(best_cost)
plt.plot([i['current_cost'] for i in history])
plt.show()
plt.plot([i[0] for i in best_path], [i[1] for i in best_path])
plt.show()
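A genetic algorithm takes a different approach: it maintains a whole population of candidate routes, keeps the best fraction of each generation, and recombines pairs of the survivors to produce the next generation.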
In [88]:
def select_best(population, cost_func, num_to_keep):
    scored_population = [(i, cost_func(i)) for i in population]
    scored_population.sort(key=lambda x: x[1])
    return [i[0] for i in scored_population[:num_to_keep]]


def recombine(population):
    # Randomly choose two parents
    options = list(range(len(population)))
    random.shuffle(options)
    partner1 = options[0]
    partner2 = options[1]
    # Choose a split point, take the first parent's order up to that split
    # point, then the second parent's order for all remaining points
    split_point = random.randint(0, len(population[0]) - 1)
    child = population[partner1][:split_point]
    for point in population[partner2]:
        if point not in child:
            child.append(point)
    return child


def genetic_algorithm_optimizer(starting_path, cost_func, new_path_func, pop_size, generations):
    # new_path_func is currently unused; see the note after this cell
    # Create a starting population by randomly shuffling the points
    population = []
    for i in range(pop_size):
        new_path = starting_path[:]
        random.shuffle(new_path)
        population.append(new_path)
    history = []
    # Keep the top 25% of routes and recombine them to create the next
    # generation, repeating for the requested number of generations
    for generation in range(generations):
        pop_best = select_best(population, cost_func, int(pop_size / 4))
        new_population = []
        for _ in range(pop_size):
            new_population.append(recombine(pop_best))
        population = new_population
        record = {'generation': generation, 'current_cost': cost_func(pop_best[0])}
        history.append(record)
    # Return the best route in the final population, not an arbitrary one
    best = select_best(population, cost_func, 1)[0]
    return (best, cost_func(best), history)
best_path, best_cost, history = genetic_algorithm_optimizer(coords, distance, new_path, 500, 100)
print(best_cost)
plt.plot([i['current_cost'] for i in history])
plt.show()
plt.plot([i[0] for i in best_path], [i[1] for i in best_path])
plt.show()
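One caveat with this implementation: there is no mutation step (the new_path_func argument is accepted but never used), so the population can converge prematurely. Applying new_path to a fraction of the children in each generation would be a natural extension.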