Genetic Algorithms for Travelling Salesman Problem

By Diego Marinho de Oliveira on 12-08-15

The travelling salesman problem (TSP) is a well know problem in Computer Science and Math field. It consist in determine given a list of cities and distance between these cities what is the shortest path that starting from an arbitrary city we could visit all other cities only once and return to the initial city.

As it is a NP-hard problem in combinatorial optimization realm, plausible solutions make use of approximations. Thus, we could use Genetic Algorithm to search for a feasible solution in a finite amount of time since it is a search heristic that mimics the process of natural selection.

Then, in this short notebook, we will explore the use of GeneticAlgorithms package in Julia to find a fesiable solution to TSP.

1 Define the Problem

Lets supose that we are from the ZMData logistic company that works on California, US and want to distribute our new high-tech devices in five cities: 1. San Francisco, 2. Palo Alto, 3. Los Angeles, 4. San Diego and 5.Las Vegas. The cities, routes and costs is represented in Figure 1. We want to find a good route that save us money and time!

How we can do that using a good approximation? We could use genetic algorithms (GA)! Lets model our problem using one nice library on Julia called GeneticAlgorithms.

2 Understand GeneticAlgorithms Packages

2.1 Install Package

The first step is to install GeneticAlgorithms package. Just simple run these below commands at your Julia REPL (Read, Evaluate, Print and Loop) console:

Pkg.add("GeneticAlgorithms")
Pkg.update();

Then import the package in your notebook as follow:


In [1]:
using GeneticAlgorithms


WARNING: deprecated syntax "{a for a in b}" at /Users/diego/.julia/v0.4/GeneticAlgorithms/src/GeneticAlgorithms.jl:158.
Use "Any[a for a in b]" instead.

2.2 Model the Problem

The package can be used to solve any problem that we could model as an optimization problem For our case, our space problem is the all possible routes for the five cities. To model our specific problem we need to define:

  • (A) Our entity that represents an instance from the population
  • (B) Fitness function to calculate a score
  • (C) Comparison function to distinguish if your current solution is better than the previous one
  • (D) Grouping strategy to select groups over a population for crossover
  • (E) Crossover over the grouped population
  • (F) Mutation over the new instance from the population

2.2.1 Define an Entity

The first step in our modelling is defining a entity of the population. An entity in our problem is a route that has as attribute the edges that connects the cities. Then we can represent our entity named Route in Julia by


In [ ]:
type Route <: Entity
    edges::Array
    fitness

    Route() = new(get_random_path(N_VERTICES))
    Route(edges) = new(edges)
end

Lets understand your model. By default we must inherit the abstract GeneticAlgorithms.Entity from the GA package. Also, we need to set fitness as an internal attribute that can be accessed by the framework in other inherited methods from Entity. We also define 2 Constructors. One receives the edges and the another one generate a random path with a constant N_VERTICES that we will define later on the notebook.

2.3.2 Fitness and Comparison Function

The fitness function will be used by the framework and expect an entity and returns a score. For your case the fitness function will by


In [ ]:
function fitness(ent)
    sum(map(e -> COSTS[first(e), last(e)], ent.edges))
end

Also we need to define a comparison function to indicates what is the best solution from the previous and current one. The best score in our case is the one that has a lower score. Thus, we define as


In [ ]:
function GeneticAlgorithms.isless(lhs::Route, rhs::Route)
    abs(lhs.fitness) > abs(rhs.fitness)
end

2.3.3 Grouping Entities

Before we make the crossover on the entities we need to group them by some criteria. As in our example we don't have any specific restriction we define a naive one as following


In [ ]:
function group_entities(pop)
    if pop[1].fitness <= STOP_COST
        return
    end

    # simple naive groupings that pair the best entitiy with every other
    for i in 1:length(pop)
        produce([1, i])
    end
end

Also here we define a criteria to stop making groups and interrupt the process of searching for new solutions. Then, we define a constant named STOP_COST.

2.3.4 Define Crossover

Crossover is a basic operator in GA that combines instances from the population producing a new generation. Then, we propose the use of crossover order to preserve the order from the parents in the new generation. Also we limitated the crossover between the two instances from the group.


In [4]:
# CrossOver OX (Based on source: http://bit.ly/1XU0HWN)
function crossover(group)
    child = Route()

    num_parents = length(group)
    num_parents < 2 && return child

    vertices_1 = edges2vertices(group[1].edges)
    vertices_2 = edges2vertices(group[2].edges)
    
    n1, n2 = shuffle(collect(1:length(vertices_1)))[1:2]
    e1, e2 = min(n1, n2)[1], max(n1, n2)[1]

    fixed_vertices = vertices_1[e1:e2]
    vertices_1 = filter(v -> !(v in fixed_vertices), vertices_1)
    new_vertices = []
    
    count_fixed, count_1, count_2 = 1, 1, 1
    for i=1:length(vertices_2)
        if e1 <= i <= e2
            push!(new_vertices, fixed_vertices[count_fixed])
            count_fixed += 1
        else
            if !(vertices_2[count_2] in fixed_vertices) &&
               !(vertices_2[count_2] in new_vertices)
                push!(new_vertices, vertices_2[count_2])
                vertices_1 = filter(v-> v != vertices_2[count_2], vertices_1)
                count_2 += 1
            else
                push!(new_vertices, vertices_1[count_1])
                vertices_2 = filter(v-> v != vertices_1[count_1], vertices_2)
                count_1 += 1
            end
        end
    end
    
    child.edges = vertices2edges(new_vertices)
    return child
end


Out[4]:
crossover (generic function with 1 method)

2.3.5 Define Mutation

The mutation act directly on a single entity and is responsable to produce some changes by chance over a new generation. We define ours by


In [ ]:
function mutate(ent)
    # Mutate only in 1% of the time
    rand(Float64) < 0.99 && return

    vertices = edges2vertices(ent.edges)

    # mutate
    n1 = rand(collect(1:length(vertices)), 1)
    n2 = rand(collect(1:length(vertices)), 1)

    vertices[n1], vertices[n2] = vertices[n2], vertices[n1]
    ent.edges = vertices2edges(vertices)
end

Our mutations occurs only 1% of the times that occurs a crossover. We decided to swap random vertices from the initial route generated in crossover.

3 Glue All Together in a Module

After defined our six points of modelling over the framework we are ready to package our solution in a module named OurTSPMathModel. Then, the summary of our modelling above can be shown by


In [18]:
module OurTSPMathModel
using GeneticAlgorithms

# Constants w/ vertices number and cost of our routes
const N_VERTICES = 5
const COSTS = [
    0 22 41 44 50;
    22  0 40 22 31;
    41 40  0 22 42;
    44 22 22  0 22;
    50 31 42  22 0];

type Route <: Entity
    edges::Array
    fitness

    Route() = new(get_random_path(N_VERTICES))
    Route(edges) = new(edges)
end

edges2vertices(edges) = map(e -> first(e), edges)

function vertices2edges(vertices)
    edges = map(i -> (vertices[i], vertices[i+1]), vcat(1:length(vertices)-1)) 
    push!(edges, (edges[end][2], edges[1][1]))
end

function get_random_path(n)
    vertices = shuffle(collect(1:n))
    vertices2edges(vertices)
end

function create_entity(num)
    edges = get_random_path(N_VERTICES)
    Route(edges)
end

function fitness(ent)
    sum(map(e -> COSTS[first(e), last(e)], ent.edges))
end

function GeneticAlgorithms.isless(lhs::Route, rhs::Route)
    abs(lhs.fitness) > abs(rhs.fitness)
end

function group_entities(pop)
    if pop[1].fitness <= 150
        return
    end

    # simple naive groupings that pair the best entitiy with every other
    for i in 1:length(pop)
        produce([1, i])
    end
end

# CrossOver OX (Based on source: http://bit.ly/1XU0HWN)
function crossover(group)
    child = Route()

    num_parents = length(group)
    num_parents < 2 && return child

    vertices_1 = edges2vertices(group[1].edges)
    vertices_2 = edges2vertices(group[2].edges)
    
    n1, n2 = shuffle(collect(1:length(vertices_1)))[1:2]
    e1, e2 = min(n1, n2)[1], max(n1, n2)[1]

    fixed_vertices = vertices_1[e1:e2]
    vertices_1 = filter(v -> !(v in fixed_vertices), vertices_1)
    new_vertices = []
    
    count_fixed, count_1, count_2 = 1, 1, 1
    for i=1:length(vertices_2)
        if e1 <= i <= e2
            push!(new_vertices, fixed_vertices[count_fixed])
            count_fixed += 1
        else
            if !(vertices_2[count_2] in fixed_vertices) &&
               !(vertices_2[count_2] in new_vertices)
                push!(new_vertices, vertices_2[count_2])
                vertices_1 = filter(v-> v != vertices_2[count_2], vertices_1)
                count_2 += 1
            else
                push!(new_vertices, vertices_1[count_1])
                vertices_2 = filter(v-> v != vertices_1[count_1], vertices_2)
                count_1 += 1
            end
        end
    end
    
    child.edges = vertices2edges(new_vertices)
    return child
end

function mutate(ent)
    # Mutate only in 1% of the time
    rand(Float64) < 0.99 && return

    vertices = edges2vertices(ent.edges)

    # mutate
    n1 = rand(collect(1:length(vertices)), 1)
    n2 = rand(collect(1:length(vertices)), 1)

    vertices[n1], vertices[n2] = vertices[n2], vertices[n1]
    ent.edges = vertices2edges(vertices)
end
end


WARNING: replacing module OurTSPMathModel
Out[18]:
OurTSPMathModel

As a matter of simplification we added the costs and vertices number in the beginning of the module. Also we defined some auxiliar functions to better encapsulate responsabilities and facilitate the reuse of code. For example, edges2vertices, vertices2edges and get_random_path are functions that were not commented at section 2 but are of importance to the all GA algorithm works. They represents transformation from edges to vertices, vertices to edges and generate random routes respectively.

4 Run our GA Model

Is very easy to run our GA model. First we need to import the GeneticAlgorithms, create our model and call the population method.


In [13]:
using GeneticAlgorithms

In [25]:
model = runga(OurTSPMathModel; initial_pop_size = 8)
population(model)


WARNING: sizehint(A,n) is deprecated, use sizehint!(A,n) instead.
 in depwarn at deprecated.jl:73
 in sizehint at deprecated.jl:50
 in crossover_population at /Users/diego/.julia/v0.4/GeneticAlgorithms/src/GeneticAlgorithms.jl:150
 in runga at /Users/diego/.julia/v0.4/GeneticAlgorithms/src/GeneticAlgorithms.jl:110
 in runga at /Users/diego/.julia/v0.4/GeneticAlgorithms/src/GeneticAlgorithms.jl:89
 in include_string at loading.jl:266
 in execute_request_0x535c5df2 at /Users/diego/.julia/v0.4/IJulia/src/execute_request.jl:177
 in eventloop at /Users/diego/.julia/v0.4/IJulia/src/IJulia.jl:141
 in anonymous at task.jl:447
while loading In[25], in expression starting on line 1
WARNING: sizehint(A,n) is deprecated, use sizehint!(A,n) instead.
 in depwarn at deprecated.jl:73
 in sizehint at deprecated.jl:50
 in crossover_population at /Users/diego/.julia/v0.4/GeneticAlgorithms/src/GeneticAlgorithms.jl:153
 in runga at /Users/diego/.julia/v0.4/GeneticAlgorithms/src/GeneticAlgorithms.jl:110
 in runga at /Users/diego/.julia/v0.4/GeneticAlgorithms/src/GeneticAlgorithms.jl:89
 in include_string at loading.jl:266
 in execute_request_0x535c5df2 at /Users/diego/.julia/v0.4/IJulia/src/execute_request.jl:177
 in eventloop at /Users/diego/.julia/v0.4/IJulia/src/IJulia.jl:141
 in anonymous at task.jl:447
while loading In[25], in expression starting on line 1
Out[25]:
8-element Array{Any,1}:
 OurTSPMathModel.Route([(3,1),(1,2),(2,4),(4,5),(5,3)],149)
 OurTSPMathModel.Route([(5,1),(1,2),(2,4),(4,3),(3,5)],158)
 OurTSPMathModel.Route([(5,1),(1,2),(2,4),(4,3),(3,5)],158)
 OurTSPMathModel.Route([(5,1),(1,2),(2,4),(4,3),(3,5)],158)
 OurTSPMathModel.Route([(5,1),(1,2),(2,4),(4,3),(3,5)],158)
 OurTSPMathModel.Route([(5,1),(1,2),(2,4),(4,3),(3,5)],158)
 OurTSPMathModel.Route([(5,1),(1,2),(2,4),(4,3),(3,5)],158)
 OurTSPMathModel.Route([(5,1),(1,4),(4,2),(2,3),(3,5)],198)

From the above results we can see that it generates 8 outputs that corresponds each one from the initial instance from the initial population. By observation we see that each run produces differents results and this is due by the crossover and mutation characteristic of the algorithm. Also we can observes that for this time the best route is:

[(3,1),(1,2),(2,4),(4,5),(5,3)] with cost 149 and is shown by Figure 2.

In [ ]:

Supposing that the best solution is 138, we can calculate the gap between then by $\frac{Opt-He}{Opt} \times 100 = -7.98$. That is not bad! If we were more paciente we could run more trials and draw the best solution from the proposed solutions shown.

6 Final Considerations

This notebook was the aim of showing how to use GA algorithms in Julia. Also wanted to introduce new concepts and highlights how simple can be solve problems in Julia.

7 About the Author

Oliveira, Diego M. is a Ph.D. candidate in Applyed Mathematics and Modelling by Universidade Aberta de Lisboa and has a Master in Computer Science by Universidade Federal de Minas Gerais. He has passion with Data Science and has broad experience in many fields as Natural Language Processing, Recommendation Systems, Machine Learning, Software Engineering, Statistics and Mathematics. Publish almost daily posts about Machine Learning on Linkedin and PracticalLearning.io. Has ZMData as a consulting company in Machine Learning, Data Science and Big Data problems and works for clients around the globe (US, Europe, among others).