The notebook is better viewed rendered as slides. You can convert it to slides and view them by:

- using nbconvert with a command like:
$ ipython nbconvert --to slides --post serve <this-notebook-name.ipynb>

- installing Reveal.js - Jupyter/IPython Slideshow Extension
- using the online IPython notebook slide viewer (some slides of the notebook might not be properly rendered).

This and other related IPython notebooks can be found at the course github repository:

Given a set of cities, and the distances between each pair of cities, find atourof the cities with the minimum total distance. Atourmeans you start at one city, visit every other city exactly once, and then return to the starting city.

- This notebook relies on Peter Norvig's IPython notebook on the traveling salesperson problem.
- I will be showing how to apply evolutionary algorithms to solve the TSP.

- This is a well-known
*intractable*problem, meaning that there are no efficient solutions that work for a large number of cities. - We can create an inefficient algorithm that works fine for a small number of cites (about a dozen).
- We can also find a
*nearly*-shortest tour over thousands of cities. - Actually, the fact there is no efficient algorithm is liberating:
**This means that we can use a very simple, inefficient algorithm and not feel too bad about it.**

**City**: For the purpose of this exercise, a city is "atomic" in the sense that we don't have to know anything about the components or attributes of a city, just how far it is from other cities.**Cities**: We will need to represent a set of cities; Python's`set`

datatype might be appropriate for that.**Distance**: We will need the distance between two cities. If`A`

and`B`

are cities. This could be done with a function,`distance(A, B)`

, or with a dict,`distance[A][B]`

or`distance[A, B]`

, or with an array if`A`

and`B`

are integer indexes. The resulting distance will be a real number (which Python calls a`float`

).**Tour**: A tour is an ordered list of cities; Python's`list`

or`tuple`

datatypes would work.**Total distance**: The sum of the distances of adjacent cities in the tour. We will probably have a function,`total_distance(tour)`

.

```
In [1]:
```import matplotlib.pyplot as plt
import matplotlib.colors as colors
import matplotlib.cm as cmx
import random, operator
import time
import itertools
import numpy
import math
%matplotlib inline
random.seed(time.time()) # planting a random seed

Generate all the possible tours of the cities, and choose the shortest one (the tour with the minimum total distance).

We can implement this as the Python function `exact_TSP`

(TSP is the standard abbreviation for Traveling Salesperson Problem, and "exact" means that it finds the shortest tour, exactly, not just an approximation to the shortest tour). Here's the design philosophy we will use:

Write Python code that closely mirrors the English description of the algorithm. This will probably require some auxilliary functions and data structures; just assume we will be able to define them as well, using the same design philosophy.

```
In [2]:
```def exact_TSP(cities):
"Generate all possible tours of the cities and choose the shortest one."
return shortest(alltours(cities))
def shortest(tours):
"Return the tour with the minimum total distance."
return min(tours, key=total_distance)

*Note 1*: We have not yet defined the function `total_distance`

, nor `alltours`

.

*Note 2*: In Python `min(`

*collection*`,key=`

*function*`)`

means to find the element *x* that is a member of *collection* such that *function(x)* is minimized. So `shortest`

finds the tour whose `total_distance`

in the minimal among the tours. So our Python code implements (and closely mimics) our English description of the algorithm. Now we need to define what a tour is, and how to measure total distance.

- A tour starts in one city, and then visits each of the other cities in order, before finally retirning to the start.
- A natural representation of the set of available cities is a Python
`set`

, and a natural representation of a tour is a sequence that is a*permutation*of the set. - The tuple
`(1, 2, 3)`

, for example, represents a tour that starts in city 1, moves to 2, then 3, and then returns to 1 to finish the tour.

```
In [3]:
```alltours = itertools.permutations # The permutation function is already defined in the itertools module
cities = {1, 2, 3}
list(alltours(cities))

```
Out[3]:
```

Now for the notion of *distance*. We define `total_distance(tour)`

as the sum of the distances between consecutive cities in the tour; that part is shown below and is easy (with one Python-specific trick: when `i`

is 0, then `distance(tour[0], tour[-1])`

gives us the wrap-around distance between the first and last cities, because `tour[-1]`

is the last element of `tour`

).

```
In [4]:
```def total_distance(tour):
"The total distance between each pair of consecutive cities in the tour."
return sum(distance(tour[i], tour[i-1])
for i in range(len(tour)))

Before we can define `distance(A, B)`

, the distance between two cities, we have to make a choice. In the fully general version of the TSP problem, the distance between two cities could be anything: it could be the amount of time it takes to travel between cities, the number of dollars it costs, or anything else.

How will we represent a two-dimensional point? Here are some choices, with their pros and cons:

**Tuple:**A point (or city) is a two-tuple of (*x*,*y*) coordinates, for example,`(300, 0)`

.**Pro:**Very simple, easy to break a point down into components. Reasonably efficient.**Con:**doesn't distinguish points from other two-tuples. If`p`

is a point, can't do`p.x`

or`p.y`

.

**class:**Define`City`

as a custom class with*x*and*y*fields.**Pro:**explicit, gives us`p.x`

accessors.**Con:**less efficient because of the overhead of creating user-defined objects.

**complex:**Python already has the two-dimensional point as a built-in numeric data type, but in a non-obvious way: as*complex numbers*, which inhabit the two-dimensional (real × complex) plane. We can make this use more explicit by defining "`City = complex`

", meaning that we can construct the representation of a city using the same constructor that makes complex numbers.**Pro:**most efficient, because it uses a builtin type that is already a pair of numbers. The distance between two points is simple: the absolute value of their difference.**Con:**it may seem confusing to bring complex numbers into play; can't say`p.x`

.

**subclass:**Define "`class Point(complex): pass`

", meaning that points are a subclass of complex numbers.**Pro:**All the pros of using`complex`

directly, with the added protection of making it more explicit that these are treated as points, not as complex numbers.**Con:**less efficient than using`complex`

directly; still can't do`p.x`

or`p.y`

.

**subclass with properties:**Define "`class Point(complex): x, y = property(lambda p: p.real), property(lambda p: p.imag)`

".**Pro:**All the pros of previous approach, and we can finally say`p.x`

.**Con:**less efficient than using`complex`

directly.

From possible alternatives Peter chose to go with `complex`

numbers:

```
In [5]:
```City = complex # Constructor for new cities, e.g. City(300, 400)

```
In [6]:
```def distance(A, B):
"The Euclidean distance between two cities."
return abs(A - B)

```
In [7]:
```A = City(300, 0)
B = City(0, 400)
distance(A, B)

```
Out[7]:
```

```
In [8]:
```def generate_cities(n):
"Make a set of n cities, each with random coordinates."
return set(City(random.randrange(10, 890),
random.randrange(10, 590))
for c in range(n))

```
In [9]:
```cities8, cities10, cities100, cities1000 = generate_cities(8), generate_cities(10), generate_cities(100), generate_cities(1000)
cities8

```
Out[9]:
```

A cool thing is to be able to plot a tour

```
In [10]:
```def plot_tour(tour, alpha=1, color=None):
# Plot the tour as blue lines between blue circles, and the starting city as a red square.
plotline(list(tour) + [tour[0]], alpha=alpha, color=color)
plotline([tour[0]], 'rs', alpha=alpha)
# plt.show()
def plotline(points, style='bo-', alpha=1, color=None):
"Plot a list of points (complex numbers) in the 2-D plane."
X, Y = XY(points)
if color:
plt.plot(X, Y, style, alpha=alpha, color=color)
else:
plt.plot(X, Y, style, alpha=alpha)
def XY(points):
"Given a list of points, return two lists: X coordinates, and Y coordinates."
return [p.real for p in points], [p.imag for p in points]

We are ready to test our algorithm

```
In [11]:
```tour = exact_TSP(cities8)

```
In [12]:
```plot_tour(tour)

```
```

The permutation `(1, 2, 3)`

represents the tour that goes from 1 to 2 to 3 and back to 1. You may have noticed that there aren't really six different tours of three cities: the cities 1, 2, and 3 form a triangle; any tour must connect the three points of the triangle; and there are really only two ways to do this: clockwise or counterclockwise. In general, with $n$ cities, there are $n!$ (that is, $n$ factorial) permutations, but only $(n-1)!$, tours that are *distinct*: the tours `123`

, `231`

, and `312`

are three ways of representing the *same* tour.

So we can make our `TSP`

program $n$ times faster by never considering redundant tours. Arbitrarily, we will say that all tours must start with the "first" city in the set of cities. We don't have to change the definition of `TSP`

—just by making `alltours`

return only nonredundant tours, the whole program gets faster.

(While we're at it, we'll make tours be represented as lists, rather than the tuples that are returned by `permutations`

. It doesn't matter now, but later on we will want to represent *partial* tours, to which we will want to append cities one by one; that can only be done to lists, not tuples.)

```
In [13]:
```def all_non_redundant_tours(cities):
"Return a list of tours, each a permutation of cities, but each one starting with the same city."
start = first(cities)
return [[start] + list(tour)
for tour in itertools.permutations(cities - {start})]
def first(collection):
"Start iterating over collection, and return the first element."
for x in collection: return x
def exact_non_redundant_TSP(cities):
"Generate all possible tours of the cities and choose the shortest one."
return shortest(all_non_redundant_tours(cities))

```
In [14]:
```all_non_redundant_tours({1, 2, 3})

```
Out[14]:
```

```
In [15]:
```%timeit exact_TSP(cities8)

```
```

```
In [16]:
```%timeit exact_non_redundant_TSP(cities8)

```
```

```
In [17]:
```%timeit exact_non_redundant_TSP(cities10)

```
```

It takes a few seconds on my machine to solve this problem. In general, the function `exact_non_redundant_TSP()`

looks at $(n-1)!$ tours for an $n$-city problem, and each tour has $n$ cities, so the time for $n$ cities should be roughly proportional to $n!$. This means that the time grows rapidly with the number of cities; we'd need longer than the **age of the Universe** to run `exact_non_redundant_TSP()`

on just 24 cities:

n cities | time |
---|---|

10 | 3 secs |

12 | 3 secs × 12 × 11 = 6.6 mins |

14 | 6.6 mins × 13 × 14 = 20 hours |

24 | 3 secs × 24! / 10! = 16 billion years |

There must be a better way... or at least we need to look for it until quantum computing comes around.

- The
*general, exact*Traveling Salesperson Problem is intractable; - there is no efficient algorithm to find the tour with minimum total distance.
- But if we restrict ourselves to Euclidean distance and if we are willing to settle for a tour that is
*reasonably*short but not the shortest, then the news is much better.

We will consider several *approximate* algorithms, which find tours that are usually within 10 or 20% of the shortest possible and can handle thousands of cities in a few seconds.

Here is our first approximate algorithm:

Start at any city; at each step extend the tour by moving from the previous city to its nearest neighbor that has not yet been visited.

This is called a *greedy algorithm*, because it greedily takes what looks best in the short term (the nearest neighbor) even when that won't always be the best in the long term.

To implement the algorithm I need to represent all the noun phrases in the English description:

**start**: a city which is arbitrarily the first city;**the tour**: a list of cities, initialy just the start city);**previous city**: the last element of tour, that is,`tour[-1]`

);**nearest neighbor**: a function that, when given a city, A, and a list of other cities, finds the one with minimal distance from A); and**not yet visited**: we will keep a set of unvisited cities; initially all cities but the start city are unvisited).

`C`

, and add it to the tour and remove it from `unvisited`

.

```
In [18]:
```def greedy_TSP(cities):
"At each step, visit the nearest neighbor that is still unvisited."
start = first(cities)
tour = [start]
unvisited = cities - {start}
while unvisited:
C = nearest_neighbor(tour[-1], unvisited)
tour.append(C)
unvisited.remove(C)
return tour

```
In [19]:
```def nearest_neighbor(A, cities):
"Find the city in cities that is nearest to city A."
return min(cities, key=lambda x: distance(x, A))

(In Python, as in the formal mathematical theory of computability, `lambda`

is the symbol for *function*, so "`lambda x: distance(x, A)`

" means the function of `x`

that computes the distance from `x`

to the city `A`

. The name `lambda`

comes from the Greek letter λ.)

We can compare the fast approximate `greedy_TSP`

algorithm to the slow `exact_TSP`

algorithm on a small map, as shown below. (If you have this page in a IPython notebook you can repeatedly `run`

the cell, and see how the algorithms compare. `Cities(9)`

will return a different set of cities each time. I ran it 20 times, and only once did the greedy algorithm find the optimal solution, but half the time it was within 10% of optimal, and it was never more than 25% worse than optimal.)

```
In [20]:
```cities = generate_cities(9)

```
In [21]:
```%timeit exact_non_redundant_TSP(cities)

```
```

```
In [22]:
```plot_tour(exact_non_redundant_TSP(cities))

```
```

```
In [23]:
```%timeit greedy_TSP(cities)

```
```

```
In [24]:
```plot_tour(greedy_TSP(cities))

```
```

```
In [25]:
```%timeit greedy_TSP(cities100)

```
```

```
In [26]:
```plot_tour(greedy_TSP(cities100))

```
```

```
In [27]:
```%timeit greedy_TSP(cities1000)

```
```

```
In [28]:
```plot_tour(greedy_TSP(cities1000))

```
```

A greedy algorithm is an algorithm that follows the problem solving heuristic of making the locally optimal choice at each stage with the hope of finding a global optimum. In many problems, a greedy strategy does not in general produce an optimal solution, but nonetheless a greedy heuristic may yield locally optimal solutions that approximate a global optimal solution in a reasonable time.

For many problmes greedy algorithms fail to produce the optimal solution, and may even produce the *unique worst possible solution*. One example is the traveling salesman problem mentioned above: for each number of cities, there is an assignment of distances between the cities for which the nearest neighbor heuristic produces the unique worst possible tour.

- We have seen in class some examples of nature-inspired metaheuristics.
- They are an option in which we dedicate a little more computational effort in order to produce better solutions than
`greedy_TSP()`

.

We will be using the DEAP library to code this tackle this problem using a genetic algorithm.

```
In [29]:
```from deap import algorithms, base, creator, tools

**Individual representation**(binary, floating-point, etc.);**evaluation**and**fitness assignment**;**selection**, that establishes a partial order of individuals in the population using their fitness function value as reference and determines the degree at which individuals in the population will take part in the generation of new (offspring) individuals.**variation**, that applies a range of evolution-inspired operators, like crossover, mutation, etc., to synthesize offspring individuals from the current (parent) population. This process is supposed to prime the fittest individuals so they play a bigger role in the generation of the offspring.**stopping criterion**, that determines when the algorithm shoulod be stopped, either because the optimum was reach or because the optimization process is not progressing.

```
def evolutionary_algorithm():
'Pseudocode of an evolutionary algorithm'
populations = [] # a list with all the populations
populations[0] = initialize_population(pop_size)
t = 0
while not stop_criterion(populations[t]):
fitnesses = evaluate(populations[t])
offspring = matting_and_variation(populations[t],
fitnesses)
populations[t+1] = environmental_selection(
populations[t],
offspring)
t = t+1
```

We will carry out our tests with a 30-cities problem.

```
In [30]:
```num_cities = 30
cities = generate_cities(num_cities)

`toolbox`

stored the setup of the algorithm. It describes the different elements to take into account.

```
In [31]:
```toolbox = base.Toolbox()

- Individuals represent possible solutions to the problem.
- In the TSP case, it looks like the tour itself can be a suitable representation.
- For simplicity, an individual can be a list with the indexes corresponding to each city.
- This will simplify the crossover and mutation operators.
- We can rely on the
`total_distance()`

function for evaluation and set the fitness assignment as to minimize it.

```
In [32]:
```creator.create("FitnessMin", base.Fitness, weights=(-1.0,))
creator.create("Individual", list, fitness=creator.FitnessMin)

`cities`

and, correspondingly, the population is composed by individuals.

```
In [33]:
```toolbox.register("indices", numpy.random.permutation, len(cities))
toolbox.register("individual", tools.initIterate, creator.Individual,
toolbox.indices)
toolbox.register("population", tools.initRepeat, list,
toolbox.individual)

Defining the crossover and mutation operators can be a challenging task.

There are various crossover operators that have been devised to deal with ordered individuals like ours.

- We will be using DEAP's
`deap.tools.cxOrdered()`

crossover. - For mutation we will swap elements from two points of the individual.
- This is performed by
`deap.tools.mutShuffleIndexes()`

.

```
In [34]:
```toolbox.register("mate", tools.cxOrdered)
toolbox.register("mutate", tools.mutShuffleIndexes, indpb=0.05)

Evaluation can be easily defined from the `total_distance()`

definition.

```
In [35]:
```def create_tour(individual):
return [list(cities)[e] for e in individual]

```
In [36]:
```def evaluation(individual):
'''Evaluates an individual by converting it into
a list of cities and passing that list to total_distance'''
return (total_distance(create_tour(individual)),)

```
In [37]:
```toolbox.register("evaluate", evaluation)

We will employ tournament selection with size 3.

```
In [38]:
```toolbox.register("select", tools.selTournament, tournsize=3)

Lets' run the algorithm with a population of 100 individuals and 400 generations.

```
In [39]:
```pop = toolbox.population(n=100)

```
In [40]:
```%%time
result, log = algorithms.eaSimple(pop, toolbox,
cxpb=0.8, mutpb=0.2,
ngen=400, verbose=False)

```
```

```
In [41]:
```best_individual = tools.selBest(result, k=1)[0]
print('Fitness of the best individual: ', evaluation(best_individual)[0])

```
```

```
In [42]:
```plot_tour(create_tour(best_individual))

```
```

It is interesting to assess how the fitness of the population changed as the evolution process took place.

We can prepare an `deap.tools.Statistics`

instance to specify what data to collect.

```
In [43]:
```fit_stats = tools.Statistics(key=operator.attrgetter("fitness.values"))
fit_stats.register('mean', numpy.mean)
fit_stats.register('min', numpy.min)

```
In [44]:
```result, log = algorithms.eaSimple(toolbox.population(n=100), toolbox,
cxpb=0.5, mutpb=0.2,
ngen=400, verbose=False,
stats=fit_stats)

```
In [45]:
```plt.figure(1, figsize=(11, 4), dpi=500)
plots = plt.plot(log.select('min'),'c-', log.select('mean'), 'b-', antialiased=True)
plt.legend(plots, ('Minimum fitness', 'Mean fitness'))
plt.ylabel('Fitness')
plt.xlabel('Iterations')

```
Out[45]:
```

`Statistics`

instance prepared for that.

```
In [46]:
```pop_stats = tools.Statistics(key=numpy.copy)
pop_stats.register('pop', numpy.copy) # -- copies the populations themselves
pop_stats.register('fitness', # -- computes and stores the fitnesses
lambda x : [evaluation(a) for a in x])

*Note*: I am aware that this could be done in a more efficient way.

```
In [47]:
```result, log = algorithms.eaSimple(toolbox.population(n=100), toolbox,
cxpb=0.5, mutpb=0.2,
ngen=400, verbose=False,
stats=pop_stats)

```
In [48]:
```def plot_population(record, min_fitness, max_fitness):
'''
Plots all individuals in a population.
Darker individuals have a better fitness.
'''
pop = record['pop']
fits = record['fitness']
index = sorted(range(len(fits)), key=lambda k: fits[k])
norm=colors.Normalize(vmin=min_fitness,
vmax=max_fitness)
sm = cmx.ScalarMappable(norm=norm,
cmap=plt.get_cmap('PuBu'))
for i in range(len(index)):
color = sm.to_rgba(max_fitness - fits[index[i]][0])
plot_tour(create_tour(pop[index[i]]), alpha=0.5, color=color)

```
In [49]:
```min_fitness = numpy.min(log.select('fitness'))
max_fitness = numpy.max(log.select('fitness'))

```
In [50]:
```plt.figure(1, figsize=(11,11), dpi=500)
for i in range(0, 12):
plt.subplot(4,3,i+1)
it = int(math.ceil((len(log)-1.)/15))
plt.title('t='+str(it*i))
plot_population(log[it*i], min_fitness, max_fitness)

```
```

```
In [51]:
```%timeit total_distance(greedy_TSP(cities))

```
```

```
In [52]:
```print('greedy_TSP() distance: ', total_distance(greedy_TSP(cities)))
print('Genetic algorithm best distance: ', evaluation(best_individual)[0])

```
```

The genetic algorithm outperformed the greedy approach at a *viable* computational cost.

*Note 1*:*Viable*depends on the particular problem, of course.*Note 2*: These results depend on the cities that were randomly generated. Your milleage may vary.

- We have just performed one run of the experiment, but genetic algorithms are stochastic algorithms and their performace should be assessed in statistical terms. Modify the genetic algorithm code in order to be able to report the comparison with
`greedy_TSP()`

in statistically sound terms. - Population size should have an impact on the performace of the algorithm. Make an experiment regarding that.
- What is the influence of the mutation and crossover probabilities in the performance of the genetic algorithm?

The population of the previous experiment can be better appreciated in animated form. We are going to use `matplotlib.animation`

and the JSAnimation library (you need to install it if you plan to run this notebook locally). Similarly, this functionality needs an HTML5 capable browser.

Part of this code has also been inspired by A Simple Animation: The Magic Triangle.

```
In [53]:
```from JSAnimation import IPython_display
from matplotlib import animation

```
In [54]:
```def update_plot_tour(plot, points, alpha=1, color='blue'):
'A function for updating a plot with an individual'
X, Y = XY(list(points) + [points[0]])
plot.set_data(X, Y)
plot.set_color(color)
return plot

```
In [55]:
```def init():
'Initialization of all plots to empty data'
for p in list(tour_plots):
p.set_data([], [])
return tour_plots

```
In [56]:
```def animate(i):
'Updates all plots to match frame _i_ of the animation'
pop = log[i]['pop']
fits = log[i]['fitness']
index = sorted(range(len(fits)), key=lambda k: fits[k])
norm=colors.Normalize(vmin=min_fitness,
vmax=max_fitness)
sm = cmx.ScalarMappable(norm=norm,
cmap=plt.get_cmap('PuBu'))
for j in range(len(tour_plots)):
color = sm.to_rgba(max_fitness - fits[index[j]][0])
update_plot_tour(tour_plots[j],
create_tour(pop[index[j]]),
alpha=0.5, color=color)
return tour_plots

```
In [ ]:
```fig = plt.figure()
ax = plt.axes(xlim=(0, 900), ylim=(0, 600))
tour_plots = [ax.plot([], [], 'bo-', alpha=0.1) for i in range(len(log[0]['pop']))]
tour_plots = [p[0] for p in tour_plots]
animation.FuncAnimation(fig, animate, init_func=init,
frames=200, interval=60, blit=True)

`.gif`

version of the animation for online viewing.

```
In [58]:
```anim = animation.FuncAnimation(fig, animate, init_func=init,
frames=200, interval=60, blit=True)
anim.save('tsp-populations.gif', writer='imagemagick')