First release of this notebook, still a work in progress. If you have any suggestions to improve it, please let me know or open a PR on GitHub!
At the LOCIE lab, the Réha-Parcs project aims to apply multi-objective optimization to building stock. The literature tends to favor genetic algorithms for this task. In the field of building physics, the NSGA-II algorithm is widely used because it is known to be stable and to give good results in terms of convergence and diversity.
A good implementation of NSGA-II is available in the DEAP library. This library makes it possible to perform complex many-objective optimization and to use many different algorithms fairly easily.
The aim of this notebook is to provide PhD students at LOCIE with a cheat sheet to quickly launch an optimization with this algorithm, along with some explanation of the code. It is loosely based on the NSGA-II example provided by the developers of DEAP, since their code is short and efficient. I have tried to add more explanation here, together with my own small experience with this example.
Some prior knowledge of how NSGA-II works is needed, as this notebook focuses on its implementation.
In [ ]:
import random
import datetime
import multiprocessing
import numpy as np
from deap import base
from deap import creator
from deap import tools
### If your evaluation function is external...
# import YourEvaluationFunction as evaluation
Let's import stuff...
Here the 3 main components of DEAP are imported:

- base: contains all of the base classes and the toolbox generator
- creator: enables the creation of objects such as individuals or populations
- tools: contains built-in functions related to optimization, such as the selection tournament for the best individuals or the crossover and mutation operators

We will need constants as parameters of the algorithm:
In [ ]:
NGEN = 50 # Number of Generation
MU = 100 # Number of individual in population
CXPB = 0.8 #Crossover probability
NDIM = 4 # Number of dimension of the individual (=number of gene)
# Bounds on the first 3 genes
LOW1, UP1 = 0, 28
# Bounds on the last gene
LOW2, UP2 = 0, 5
BOUNDS = [(LOW1, UP1) for i in range(NDIM-1)] + [(LOW2, UP2)]
In [ ]:
toolbox = base.Toolbox()
def init_opti():
    creator.create("FitnessMin", base.Fitness, weights=(-1.0, -1.0, -1.0))
    creator.create("Individual", list, typecode='d', fitness=creator.FitnessMin)
    toolbox.register("individual", init_ind, icls=creator.Individual, ranges=BOUNDS)
    toolbox.register("population", tools.initRepeat, list, toolbox.individual)
    toolbox.register("evaluate", evaluation)
    toolbox.register("mate", tools.cxOnePoint)
    toolbox.register("mutate", tools.mutUniformInt, low=[x[0] for x in BOUNDS],
                     up=[x[1] for x in BOUNDS], indpb=0.1 / NDIM)
    toolbox.register("select", tools.selNSGA2)
DEAP is based on a toolbox: each registered tool is a component of the algorithm. Here 2 creators are used, and the first one is important because it creates the fitness attribute of each individual. Moreover, the weights tuple has as many elements as the evaluation function has objectives. Here it means that we will minimize (because the weights are negative) 3 objectives. Should you wish to add an objective, you must add a term to this tuple, otherwise it won't be taken into account (it happened to me!)
Afterwards, the code registers tools that define the characteristics of the optimization, such as the definition of the individual, the evaluation function, the mutation and crossover operators, and the selection operator.
NSGA-II is used for the selection tournament described by Deb et al. (2002). I used a classic one-point crossover operator in this example. The mutation operator takes the bounds of each gene of the individual as parameters; otherwise a gene could mutate out of bounds and make the evaluation function fail.
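To see why the weights matter, here is a minimal pure-Python sketch (no DEAP required) of the idea behind weighted fitness: raw objective values are multiplied by the weights before any comparison, so weights of -1.0 turn minimization into an internal maximization. The `wvalues` name mirrors the attribute DEAP's Fitness uses; the rest is an illustrative reimplementation, not DEAP's actual code.

```python
WEIGHTS = (-1.0, -1.0, -1.0)  # one weight per objective, negative = minimize

def wvalues(values, weights=WEIGHTS):
    """Weighted values: what comparisons actually operate on."""
    return tuple(v * w for v, w in zip(values, weights))

def dominates(values_a, values_b, weights=WEIGHTS):
    """True if a dominates b: no worse on every objective, strictly better on one."""
    wa, wb = wvalues(values_a, weights), wvalues(values_b, weights)
    return all(x >= y for x, y in zip(wa, wb)) and any(x > y for x, y in zip(wa, wb))

# With negative weights, lower raw objective values dominate:
print(dominates((100, 20, 300), (120, 25, 300)))  # True
print(dominates((120, 25, 300), (100, 20, 300)))  # False
```

Note that `zip` silently truncates to the shorter tuple, which is exactly why a forgotten weight makes an extra objective disappear from the comparison.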
In [ ]:
def init_ind(icls, ranges):
    genome = list()
    for p in ranges:
        # note: np.random.randint excludes the upper bound
        genome.append(np.random.randint(*p))
    return icls(genome)
You will need to initialize the population. The individual toolbox features an individual creator that transforms a list into an individual object: this is the icls function. The initialization can be done in various ways, depending on what the evaluation function needs as input. Here it needs integers within the bounds defined earlier.
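As a quick sanity check, the same initializer can be exercised outside DEAP by passing plain `list` in place of creator.Individual, since icls is just a callable that wraps the genome:

```python
import numpy as np

BOUNDS = [(0, 28), (0, 28), (0, 28), (0, 5)]

def init_ind(icls, ranges):
    genome = list()
    for p in ranges:
        # np.random.randint excludes the upper bound
        genome.append(np.random.randint(*p))
    return icls(genome)

# Plain `list` stands in for creator.Individual here:
ind = init_ind(list, BOUNDS)
print(ind)
print(all(low <= g < up for g, (low, up) in zip(ind, BOUNDS)))  # True
```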
In [ ]:
def main():
    pareto = tools.ParetoFront()
    pop = toolbox.population(n=MU)
    graph = []
    data = []
    # Evaluate the individuals with an invalid fitness
    invalid_ind = [ind for ind in pop if not ind.fitness.valid]
    fitnesses = toolbox.map(toolbox.evaluate, invalid_ind)
    data.append(fitnesses)
    for ind, fit in zip(invalid_ind, fitnesses):
        ind.fitness.values = fit
        graph.append(ind.fitness.values)
    # This is just to assign the crowding distance to the individuals,
    # no actual selection is done
    pop = toolbox.select(pop, len(pop))
    # Begin the generational process
    for gen in range(1, NGEN):
        # Vary the population
        offspring = tools.selTournamentDCD(pop, len(pop))
        offspring = [toolbox.clone(ind) for ind in offspring]
        for ind1, ind2 in zip(offspring[::2], offspring[1::2]):
            if random.random() <= CXPB:
                toolbox.mate(ind1, ind2)
            toolbox.mutate(ind1)
            toolbox.mutate(ind2)
            del ind1.fitness.values, ind2.fitness.values
        # Evaluate the individuals with an invalid fitness
        invalid_ind = [ind for ind in offspring if not ind.fitness.valid]
        fitnesses = toolbox.map(toolbox.evaluate, invalid_ind)
        data.append(fitnesses)
        for ind, fit in zip(invalid_ind, fitnesses):
            ind.fitness.values = fit
            graph.append(ind.fitness.values)
        # Select the next generation population
        pop = toolbox.select(pop + offspring, MU)
        pareto.update(pop)
    return pop, pareto, graph, data
That is a lot of code lines, but nothing really hard to understand.
On the first line, a ParetoFront object is created to store the individuals that belong to the first non-dominated front. Then a few lines check whether the fitness associated with each individual is valid and, if not, evaluate the individual and assign its fitness.
Inside the generational loop, the algorithm selects pairs of individuals with a tournament based on dominance and crowding distance (note that selTournamentDCD requires a population size that is a multiple of 4), applies crossover to each pair with probability CXPB, then mutates both individuals and invalidates their fitness so they get re-evaluated.
The code is not that DEAP-specific and is self-sufficient for the rest of the process.
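For intuition, tools.ParetoFront keeps only the non-dominated individuals of everything it has seen. Here is a minimal pure-Python sketch of that filtering for minimization on all objectives (an illustration of the concept, not DEAP's implementation):

```python
def dominates(a, b):
    """For minimization: a dominates b if a is no worse everywhere
    and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def first_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

points = [(1, 9), (3, 3), (2, 7), (4, 4), (9, 1)]
# (4, 4) is dominated by (3, 3); every other point survives
print(first_front(points))  # [(1, 9), (3, 3), (2, 7), (9, 1)]
```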
In [ ]:
def evaluation(ind):
    objective1 = random.randint(10, 1000)
    objective2 = random.randint(10, 50)
    objective3 = random.randint(200, 500)
    return objective1, objective2, objective3
Please replace this with the evaluation function you want!
In the Réha-Parcs project, we use a thermal simulation function that returns the energy demand of a building and a comfort criterion calculated by the simulation.
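Since the random placeholder above makes the optimizer chase noise, a deterministic toy function is often more useful for testing the pipeline. Here is a hypothetical stand-in (not the Réha-Parcs simulation) with built-in trade-offs between the three objectives:

```python
def toy_evaluation(ind):
    """Hypothetical 3-objective toy function with real trade-offs:
    raising the genes improves objective 2 but worsens objective 1."""
    objective1 = sum(ind)                         # total "investment"
    objective2 = sum((28 - g) for g in ind[:3])   # unmet "demand" on the first genes
    objective3 = max(ind) - min(ind)              # spread between genes
    return objective1, objective2, objective3

print(toy_evaluation([0, 0, 0, 0]))      # (0, 84, 0)
print(toy_evaluation([28, 28, 28, 5]))   # (89, 0, 23)
```

With such a function, the algorithm has an actual Pareto front to converge to, which makes it much easier to check that selection and variation behave as expected.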
I set it up for multiprocessing in this example, mainly to show that it is integrated in the DEAP toolbox: thanks to the developers for making it so easy! Be careful to start your pool after creating your creator objects, otherwise they won't be taken into account in the pool workers.
For more on multiprocessing in Python, please refer to this notebook: Unlock the power of your computer with multiprocessing computation
In [ ]:
if __name__ == "__main__":
    init_opti()
    # Multiprocessing pool: I want to compute faster
    pool = multiprocessing.Pool()
    toolbox.register("map", pool.map)
    pop, optimal_front, graph_data, data = main()
    # Saving the Pareto front, for further exploitation
    with open('./pareto_front.txt', 'w') as front:
        for ind in optimal_front:
            front.write(str(ind.fitness) + '\n')
If you would like to plot your Pareto front, that is also pretty easy with the data provided.
In [ ]:
import matplotlib.pyplot as plt
x, y, z = zip(*[ind.fitness.values for ind in optimal_front])
fig = plt.figure()
fig.set_size_inches(15,10)
axe = plt.subplot2grid((2,2),(0,0))
axe.set_ylabel('Objective 2', fontsize=15)
axe.scatter(x, y, c='b', marker='+')
axe = plt.subplot2grid((2,2),(1,0))
axe.set_ylabel('Objective 3', fontsize=15)
axe.set_xlabel('Objective 1', fontsize=15)
axe.scatter(x, z, c='b', marker='+')
axe = plt.subplot2grid((2,2),(1,1))
axe.set_xlabel('Objective 2', fontsize=15)
scat = axe.scatter(y, z, c='b', marker='+')
plt.show()
This is made to launch the optimization and write down the Pareto front for further exploitation. There are still other variables that are not exploited: it really depends on what comes after the multi-objective optimization.
This notebook is still a work in progress, but here is a quick way to launch an optimization and understand a bit better how the DEAP library works. Do ask for further details, as I would like to update this notebook with the technical points that are a bit tricky to understand.
And if you are trying to optimize building retrofits, please check out the Réha-Parcs project: we are trying to do the same here!
Deb, Kalyanmoy, Amrit Pratap, Sameer Agarwal, and T. Meyarivan. 2002. “A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II.” IEEE Transactions on Evolutionary Computation 6 (2): 182–97. https://doi.org/10.1109/4235.996017.
Fortin, Félix-Antoine, François-Michel De Rainville, Marc-André Gardner, Marc Parizeau, and Christian Gagné. 2012. "DEAP: Evolutionary Algorithms Made Easy." Journal of Machine Learning Research 13: 2171–75.