This tutorial includes everything you need to set up decision optimization engines, build constraint programming models.
When you finish this tutorial, you'll have a foundational knowledge of Prescriptive Analytics.
This notebook is part of Prescriptive Analytics for Python
It requires either an installation of CPLEX Optimizers or it can be run on IBM Watson Studio Cloud (Sign up for a free IBM Cloud account and you can start using Watson Studio Cloud right away).
Table of contents:
This is a problem of building five houses in different locations; the masonry, roofing, painting, etc. must be scheduled. Some tasks must necessarily take place before others and these requirements are expressed through precedence constraints.
There are three workers, and each worker has a given skill level for each task. Each task requires one worker; the worker assigned must have a non-null skill level for the task. A worker can be assigned to only one task at a time.
Each house has a deadline.
The objective is to maximize the skill levels of the workers assigned to the tasks.
Prescriptive analytics technology recommends actions based on desired outcomes, taking into account specific scenarios, resources, and knowledge of past and current events. This insight can help your organization make better decisions and have greater control of business outcomes.
Prescriptive analytics is the next step on the path to insight-based actions. It creates value through synergy with predictive analytics, which analyzes data to predict future outcomes.
Prescriptive analytics takes that insight to the next level by suggesting the optimal way to handle that future situation. Organizations that can act fast in dynamic conditions and make superior decisions in uncertain environments gain a strong competitive advantage.
For example:
In [ ]:
import sys
try:
import docplex.cp
except:
if hasattr(sys, 'real_prefix'):
#we are in a virtual env.
!pip install docplex
else:
!pip install --user docplex
Note that the more global package docplex contains another subpackage docplex.mp that is dedicated to Mathematical Programming, another branch of optimization.
For display of the solution, ensure last version of matplotlib is available:
In [ ]:
try:
import matplotlib
if matplotlib.__version__ < "1.4.3":
!pip install --upgrade matplotlib
except:
!pip install --user matplotlib
Now, we need to import all required modeling functions that are provided by the docplex.cp package:
In [ ]:
from docplex.cp.model import CpoModel
from sys import stdout
from collections import namedtuple
Planning contains the number of houses and the max amount of periods (days) for our schedule
In [ ]:
NB_HOUSES = 5
MAX_AMOUNT_OF_PERIODS = 318
HOUSES = range(1, NB_HOUSES + 1)
All tasks must start and end between 0 and the max amount of periods
In [ ]:
period_domain = (0, MAX_AMOUNT_OF_PERIODS)
For each task type in the house building project, the following table shows the duration of the task in days along with the tasks that must be finished before the task can start. A worker can only work on one task at a time; each task, once started, may not be interrupted.
Task | Duration | Preceding tasks |
---|---|---|
masonry | 35 | |
carpentry | 15 | masonry |
plumbing | 40 | masonry |
ceiling | 15 | masonry |
roofing | 5 | carpentry |
painting | 10 | ceiling |
windows | 5 | roofing |
facade | 10 | roofing, plumbing |
garden | 5 | roofing, plumbing |
moving | 5 | windows, facade, garden, painting |
In [ ]:
Task = (namedtuple("Task", ["name", "duration"]))
TASKS = {Task("masonry", 35),
Task("carpentry", 15),
Task("plumbing", 40),
Task("ceiling", 15),
Task("roofing", 5),
Task("painting", 10),
Task("windows", 5),
Task("facade", 10),
Task("garden", 5),
Task("moving", 5),
}
In [ ]:
TaskPrecedence = (namedtuple("TaskPrecedence", ["beforeTask", "afterTask"]))
TASK_PRECEDENCES = {TaskPrecedence("masonry", "carpentry"),
TaskPrecedence("masonry", "plumbing"),
TaskPrecedence("masonry", "ceiling"),
TaskPrecedence("carpentry", "roofing"),
TaskPrecedence("ceiling", "painting"),
TaskPrecedence("roofing", "windows"),
TaskPrecedence("roofing", "facade"),
TaskPrecedence("plumbing", "facade"),
TaskPrecedence("roofing", "garden"),
TaskPrecedence("plumbing", "garden"),
TaskPrecedence("windows", "moving"),
TaskPrecedence("facade", "moving"),
TaskPrecedence("garden", "moving"),
TaskPrecedence("painting", "moving"),
}
There are three workers with varying skill levels in regard to the ten tasks. If a worker has a skill level of zero for a task, he may not be assigned to the task.
Task | Joe | Jack | Jim |
---|---|---|---|
masonry | 9 | 5 | 0 |
carpentry | 7 | 0 | 5 |
plumbing | 0 | 7 | 0 |
ceiling | 5 | 8 | 0 |
roofing | 6 | 7 | 0 |
painting | 0 | 9 | 6 |
windows | 8 | 0 | 5 |
façade | 5 | 5 | 0 |
garden | 5 | 5 | 9 |
moving | 6 | 0 | 8 |
In [ ]:
WORKERS = {"Joe", "Jack", "Jim"}
In [ ]:
Skill = (namedtuple("Skill", ["worker", "task", "level"]))
SKILLS = {Skill("Joe", "masonry", 9),
Skill("Joe", "carpentry", 7),
Skill("Joe", "ceiling", 5),
Skill("Joe", "roofing", 6),
Skill("Joe", "windows", 8),
Skill("Joe", "facade", 5),
Skill("Joe", "garden", 5),
Skill("Joe", "moving", 6),
Skill("Jack", "masonry", 5),
Skill("Jack", "plumbing", 7),
Skill("Jack", "ceiling", 8),
Skill("Jack", "roofing", 7),
Skill("Jack", "painting", 9),
Skill("Jack", "facade", 5),
Skill("Jack", "garden", 5),
Skill("Jim", "carpentry", 5),
Skill("Jim", "painting", 6),
Skill("Jim", "windows", 5),
Skill("Jim", "garden", 9),
Skill("Jim", "moving", 8)
}
find_tasks: returns the task it refers to in the TASKS vector
In [ ]:
def find_tasks(name):
return next(t for t in TASKS if t.name == name)
find_skills: returns the skill it refers to in the SKILLS vector
In [ ]:
def find_skills(worker, task):
return next(s for s in SKILLS if (s.worker == worker) and (s.task == task))
find_max_level_skill: returns the tuple "skill" where the level is themaximum for a given task
In [ ]:
def find_max_level_skill(task):
st = [s for s in SKILLS if s.task == task]
return next(sk for sk in st if sk.level == max([s.level for s in st]))
The model is represented by a Python object that is filled with the different model elements (variables, constraints, objective function, etc). The first thing to do is then to create such an object:
In [ ]:
mdl = CpoModel(name="HouseBuilding")
Modeling an interval of time during which a particular property holds
(an activity executes, a resource is idle, a tank must be non-empty, …)
interval_var(start=(0,1000), end=(0,1000), size=(10,20))
For each house, an interval variable is created for each task.
This interval must start and end inside the period_domain and its duration is set as the value stated in TASKS definition.
In [ ]:
tasks = {} # dict of interval variable for each house and task
for house in HOUSES:
for task in TASKS:
tasks[(house, task)] = mdl.interval_var(start=period_domain,
end=period_domain,
size=task.duration,
name="house {} task {}".format(house, task))
Modeling optional activities, alternative execution modes for activities, and … most of the discrete decisions in a schedule
interval_var(optional=True, start=(0,1000), end=(0,1000), size=(10,20))
For each house, an optional interval variable is created for each skill.
Skill being a tuple (worker, task, level), this means that for each house, an optional interval variable is created for each couple worker-task such that the skill level of this worker for this task is > 0.
The "set_optional()" specifier allows a choice between different variables, thus between different couples house-skill. This means that the engine decides if the interval will be present or absent in the solution.
In [ ]:
wtasks = {} # dict of interval variable for each house and skill
for house in HOUSES:
for skill in SKILLS:
iv = mdl.interval_var(name='H' + str(house) + '-' + skill.task + '(' + skill.worker + ')')
iv.set_optional()
wtasks[(house, skill)] = iv
Semantic of the constraints handles optionality (as for all constraints in CP Optimizer).
Example of endBeforeStart:
end_before_start(a,b,z)
present(a) AND present(b) &Implies end(a)+z &LessSlantEqual start(b)
The tasks in the model have precedence constraints that are added to the model.
In [ ]:
for h in HOUSES:
for p in TASK_PRECEDENCES:
mdl.add(mdl.end_before_start(tasks[(h, find_tasks(p.beforeTask))], tasks[(h, find_tasks(p.afterTask))]))
alternative(a,[b1,...,bn])
To constrain the solution so that exactly one of the interval variables wtasks associated with a given task of a given house is to be present in the solution, an "alternative" constraint is used.
In [ ]:
for h in HOUSES:
for t in TASKS:
mdl.add(mdl.alternative(tasks[(h, t)], [wtasks[(h, s)] for s in SKILLS if (s.task == t.name)], 1))
To add the constraints that a given worker can be assigned only one task at a given moment in time, a noOverlap constraint is used.
In [ ]:
for w in WORKERS:
mdl.add(mdl.no_overlap([wtasks[(h, s)] for h in HOUSES for s in SKILLS if s.worker == w]))
The presence of an interval variable in wtasks in the solution must be accounted for in the objective. Thus for each of these possible tasks, the cost is incremented by the product of the skill level and the expression representing the presence of the interval variable in the solution.
The objective of this problem is to maximize the skill level used for all the tasks.
In [ ]:
obj = mdl.sum([s.level * mdl.presence_of(wtasks[(h, s)]) for s in SKILLS for h in HOUSES])
mdl.add(mdl.maximize(obj))
In [ ]:
# Solve the model
print("\nSolving model....")
msol = mdl.solve(TimeLimit=10)
In [ ]:
print("Solve status: " + msol.get_solve_status())
if msol.is_solution():
stdout.write("Solve time: " + str(msol.get_solve_time()) + "\n")
# Sort tasks in increasing begin order
ltasks = []
for hs in HOUSES:
for tsk in TASKS:
(beg, end, dur) = msol[tasks[(hs, tsk)]]
ltasks.append((hs, tsk, beg, end, dur))
ltasks = sorted(ltasks, key = lambda x : x[2])
# Print solution
print("\nList of tasks in increasing start order:")
for tsk in ltasks:
print("From " + str(tsk[2]) + " to " + str(tsk[3]) + ", " + tsk[1].name + " in house " + str(tsk[0]))
else:
stdout.write("No solution found\n")
You can set POP_UP_GRAPHIC=True if you prefer a pop up graphic window instead of an inline one.
In [ ]:
POP_UP_GRAPHIC=False
In [ ]:
import docplex.cp.utils_visu as visu
import matplotlib.pyplot as plt
if not POP_UP_GRAPHIC:
%matplotlib inline
#Change the plot size
from pylab import rcParams
rcParams['figure.figsize'] = 15, 3
With the aim to facilitate the display of tasks names, we keep only the n first characters.
In [ ]:
def compact_name(name,n): return name[:n]
In [ ]:
if msol and visu.is_visu_enabled():
workers_colors = {}
workers_colors["Joe"] = 'lightblue'
workers_colors["Jack"] = 'violet'
workers_colors["Jim"] = 'lightgreen'
visu.timeline('Solution per houses', 0, MAX_AMOUNT_OF_PERIODS)
for h in HOUSES:
visu.sequence(name="house " + str(h))
for s in SKILLS:
wt = msol.get_var_solution(wtasks[(h,s)])
if wt.is_present():
color = workers_colors[s.worker]
wtname = compact_name(s.task,2)
visu.interval(wt, color, wtname)
visu.show()
The purpose of this function is to compact the names of the different tasks with the aim of making the graphical display readable. </p> For example "H3-garden" becomes "G3"
In [ ]:
def compact_house_task(name):
loc, task = name[1:].split('-', 1)
return task[0].upper() + loc
Green-like color when task is using the most skilled worker Red-like color when task does not use the most skilled worker
In [ ]:
if msol and visu.is_visu_enabled():
visu.timeline('Solution per workers', 0, MAX_AMOUNT_OF_PERIODS)
for w in WORKERS:
visu.sequence(name=w)
for h in HOUSES:
for s in SKILLS:
if s.worker == w:
wt = msol.get_var_solution(wtasks[(h,s)])
if wt.is_present():
ml = find_max_level_skill(s.task).level
if s.level == ml:
color = 'lightgreen'
else:
color = 'salmon'
wtname = compact_house_task(wt.get_name())
visu.interval(wt, color, wtname)
visu.show()
The last available installable package is available on Pypi here: https://pypi.python.org/pypi/docplex
A complete set of modeling examples can be downloaded here: https://github.com/IBMDecisionOptimization/docplex-examples
Copyright © 2017, 2018 IBM. IPLA licensed Sample Materials.