Problem Statement Family members from different places in the country wish to meet up in New York. They will arrive on the same day and leave on the same day, and they would like to share transportation to and from the airport. There are dozens of flights per day to NY from any of the family members locations, all leaving at different times. The flights also vary in price and duration.
Example taken from: "Building Smart Web 2.0 Applications: Programming Collective Intelligence" by Toby Segran
In [1]:
import collections
import time
import random
import math
import matplotlib.pyplot as plt
%matplotlib inline
In [2]:
# Family members and their point of origin
people = [
('Seymour', 'BOS'),
('Franny', 'DAL'),
('Zooey', 'CAK'),
('Walt', 'MIA'),
('Buddy', 'ORD'),
('Les', 'OMA'),
]
# Destination
destination = 'LGA' # LaGuardia (New York)
In [3]:
# Get Flight information
def parseCsv(lines):
flights = collections.defaultdict(list)
for line in lines:
origin, dest, depart, arrive, price = line.strip().split(',')
#Add details to the list of possible flights
flights[(origin, dest)].append( (depart, arrive, int(price)) )
return flights
In [4]:
# Information about possible flights that can be taken
# "schedule.txt" contains, comma-separated, information about origin, destination, departure time, arrival time, and price.
flights = parseCsv(open('schedule.txt'))
In [5]:
# Utility function to calculate how many minutes, into a day, a time is.
# This will be used to calculate flight times and waiting times
def getminutes(t):
x = time.strptime(t,'%H:%M')
return x[3]*60+x[4]
In [6]:
# This function takes the solution (list of numbers) and prints out a schedule that is easier to interpret.
# This function prints each persons name and origin, as well as the depature time, arrival time, and
# price for outgoing and return flights.
def printschedule(r,dest):
for d in range(len(r)/2):
name = people[d][0]
origin = people[d][1]
out = flights[(origin, dest)][r[d]]
ret = flights[(origin, dest)][r[d + 1]]
print '%10s%10s %5s-%5s $%3s %5s-%5s $%3s' % (
name, origin, out[0], out[1], out[2], ret[0], ret[1], ret[2])
We will be representing the solution as a list of numbers: The number represents a flight a person chooses to take (e.g., 0=first flight of day, 1=second flight of day, etc.).
Note: Since each person needs to take an inbound and outbound flight, the lenegth of the list is twice the number of people.
In [7]:
# Example
s = [1,4,3,2,7,3,6,3,2,4,5,3]
printschedule(s,'LGA')
In [8]:
# This function takes into account the total trip and the total time spent waiting at the
# airport for the various members of the family.
#
# The logic is that total wait time assumes that all the family members will leave the airport
# together when the last person arrives, and will all go to the airport for the earliest
# departure.
#
# The goal is to minimize the cost by choosing the correct set of numbers.
def costfunc(flights, dest):
def schedulecost(sol):
totalprice = 0
latestarrival = 0
earliestdep = 24*60
for d in range(len(sol) / 2):
origin = people[d][1]
outbound = flights[(origin, dest)][int(sol[d])]
returnf = flights[(dest, origin)][int(sol[d+1])]
totalprice += outbound[2] + returnf[2]
# Track the latest arrival and earliest departure
latestarrival = max(latestarrival, getminutes(outbound[1]))
earliestdep = min(latestarrival, getminutes(returnf[0]))
# Every person must wait until the last person arrives.
# They must also arrive when the first flight leaves
totalwait = 0
for d in range(len(sol)/2):
origin = people[d][1]
outbound = flights[(origin, dest)][int(sol[d])]
returnf = flights[(dest, origin)][int(sol[d+1])]
totalwait += latestarrival - getminutes(outbound[1])
totalwait += getminutes(returnf[0]) - earliestdep
# One additional day of car rental fees?
if latestarrival >= earliestdep: totalprice += 50
return totalprice + totalwait
return schedulecost
In [9]:
def geneticoptimize(population,costf, popsize=50, step=1,
mutprob=0.2, elite=0.2, maxiter=100):
# Make a small, simple random change to an existing solution
def mutate(vec):
# Get a random integer
i = random.randint(0, len(population)-1)
if random.random() < 0.5:
# Decreasing value
return vec[0:i] + [vec[i] - step] + vec[i+1:]
else:
# Increasing value
return vec[0:i] + [vec[i] + step] + vec[i+1:]
return vec
def crossover(r1, r2):
i = random.randint(1, len(population)-2)
return r1[0:i] + r2[i:]
# Starting population
pop = [[random.randint(population[j][0],population[j][1])
for j in range(len(population))]
for i in range(popsize)]
# This is the number of winners we want from each generation
topelite = int(elite*popsize)
# Variable to hold cost history
costhistory = [None]*maxiter
for i in range(maxiter):
# Sort the population by cost
ranked = sorted(pop, key=costf)
# add winners
pop = ranked[0:topelite]
# add mutated and bred forms of the winners
while len(pop) < popsize:
if random.random() < mutprob:
# Randomly pick a winner and mutate
newIndividual = mutate(ranked[random.randint(0, topelite-1)])
else:
# Randomly pick 2 winners and perform a cross-over
newIndividual = crossover(ranked[random.randint(0, topelite-1)],ranked[random.randint(0, topelite-1)])
pop.append(newIndividual)
# Print current best score
costhistory[i] = costf(ranked[0])
plt.plot(costhistory)
plt.xlabel('Iteration Number')
plt.ylabel('Cost')
# Return current best guy
return ranked[0]# Random solutions
In [10]:
# Cost function that will be used to find "best" solution
f = costfunc(flights, destination)
# Initial Population
population = [(0, 8)] * (len(people) * 2)
# Run genetic algorithm
r = geneticoptimize(population, f)
# Print final schedule and cost
printschedule(r, destination)
print 'Final Cost = ' + str(f(r))
In [10]:
In [10]: