House Building with worker skills

This tutorial includes everything you need to set up decision optimization engines, build constraint programming models.

When you finish this tutorial, you'll have a foundational knowledge of Prescriptive Analytics.

This notebook is part of Prescriptive Analytics for Python

It requires either an installation of CPLEX Optimizers or it can be run on IBM Watson Studio Cloud (Sign up for a free IBM Cloud account and you can start using Watson Studio Cloud right away).

Table of contents:

Describe the business problem
How decision optimization (prescriptive analytics) can help
Use decision optimization
Summary

Describe the business problem

This is a problem of building five houses in different locations; the masonry, roofing, painting, etc. must be scheduled. Some tasks must necessarily take place before others and these requirements are expressed through precedence constraints.
There are three workers, and each worker has a given skill level for each task. Each task requires one worker; the worker assigned must have a non-null skill level for the task. A worker can be assigned to only one task at a time.
Each house has a deadline.
The objective is to maximize the skill levels of the workers assigned to the tasks.

How decision optimization can help

Prescriptive analytics technology recommends actions based on desired outcomes, taking into account specific scenarios, resources, and knowledge of past and current events. This insight can help your organization make better decisions and have greater control of business outcomes.
Prescriptive analytics is the next step on the path to insight-based actions. It creates value through synergy with predictive analytics, which analyzes data to predict future outcomes.
Prescriptive analytics takes that insight to the next level by suggesting the optimal way to handle that future situation. Organizations that can act fast in dynamic conditions and make superior decisions in uncertain environments gain a strong competitive advantage.
For example:
- Automate complex decisions and trade-offs to better manage limited resources.
- Take advantage of a future opportunity or mitigate a future risk.
- Proactively update recommendations based on changing events.
- Meet operational goals, increase customer loyalty, prevent threats and fraud, and optimize business processes.

About Detailed Scheduling concepts

Scheduling consists of assigning starting and completion times to a set of activities while satisfying different types of constraints (resource availability, precedence relationships, … ) and optimizing some criteria (minimizing tardiness, …)
Time is considered as a continuous dimension: domain of possible start/completion times for an activity is potentially very large
Beside start and completion times of activities, other types of decision variables are often involved in real industrial scheduling problems (resource allocation, optional activities …)

Use decision optimization

Step 1: Download the library

Run the following code to install Decision Optimization CPLEX Modeling library. The DOcplex library contains the two modeling packages, Mathematical Programming and Constraint Programming, referred to earlier.



In [ ]:

    
import sys
try:
    import docplex.cp
except:
    if hasattr(sys, 'real_prefix'):
        #we are in a virtual env.
        !pip install docplex
    else:
        !pip install --user docplex

Note that the more global package docplex contains another subpackage docplex.mp that is dedicated to Mathematical Programming, another branch of optimization.

Step 2: Set up the prescriptive engine

For display of the solution, ensure last version of matplotlib is available:



In [ ]:

    
try:
   import matplotlib
   if matplotlib.__version__ < "1.4.3":
        !pip install --upgrade matplotlib
except:
   !pip install --user matplotlib

Now, we need to import all required modeling functions that are provided by the docplex.cp package:



In [ ]:

    
from docplex.cp.model import CpoModel
from sys import stdout
from collections import namedtuple

Step 3: Model the data

Planning contains the number of houses and the max amount of periods (days) for our schedule



In [ ]:

    
NB_HOUSES = 5
MAX_AMOUNT_OF_PERIODS = 318
HOUSES = range(1, NB_HOUSES + 1)

All tasks must start and end between 0 and the max amount of periods



In [ ]:

    
period_domain = (0, MAX_AMOUNT_OF_PERIODS)

For each task type in the house building project, the following table shows the duration of the task in days along with the tasks that must be finished before the task can start. A worker can only work on one task at a time; each task, once started, may not be interrupted.

Task	Duration	Preceding tasks
masonry	35
carpentry	15	masonry
plumbing	40	masonry
ceiling	15	masonry
roofing	5	carpentry
painting	10	ceiling
windows	5	roofing
facade	10	roofing, plumbing
garden	5	roofing, plumbing
moving	5	windows, facade, garden, painting

Tasks' durations



In [ ]:

    
Task = (namedtuple("Task", ["name", "duration"]))
TASKS = {Task("masonry",   35),
         Task("carpentry", 15),
         Task("plumbing",  40),
         Task("ceiling",   15),
         Task("roofing",    5),
         Task("painting",  10),
         Task("windows",    5),
         Task("facade",    10),
         Task("garden",     5),
         Task("moving",     5),
        }

The tasks precedences



In [ ]:

    
TaskPrecedence = (namedtuple("TaskPrecedence", ["beforeTask", "afterTask"]))
TASK_PRECEDENCES = {TaskPrecedence("masonry",   "carpentry"),
                    TaskPrecedence("masonry",   "plumbing"),
                    TaskPrecedence("masonry",   "ceiling"),
                    TaskPrecedence("carpentry", "roofing"),
                    TaskPrecedence("ceiling",   "painting"),
                    TaskPrecedence("roofing",   "windows"),
                    TaskPrecedence("roofing",   "facade"),
                    TaskPrecedence("plumbing",  "facade"),
                    TaskPrecedence("roofing",   "garden"),
                    TaskPrecedence("plumbing",  "garden"),
                    TaskPrecedence("windows",   "moving"),
                    TaskPrecedence("facade",    "moving"),
                    TaskPrecedence("garden",    "moving"),
                    TaskPrecedence("painting",  "moving"),
                   }

There are three workers with varying skill levels in regard to the ten tasks. If a worker has a skill level of zero for a task, he may not be assigned to the task.

Task	Joe	Jack	Jim
masonry	9	5	0
carpentry	7	0	5
plumbing	0	7	0
ceiling	5	8	0
roofing	6	7	0
painting	0	9	6
windows	8	0	5
façade	5	5	0
garden	5	5	9
moving	6	0	8

Workers Names



In [ ]:

    
WORKERS = {"Joe", "Jack", "Jim"}

Workers Name and level for each of there skill



In [ ]:

    
Skill = (namedtuple("Skill", ["worker", "task", "level"]))
SKILLS = {Skill("Joe",  "masonry",   9),
          Skill("Joe",  "carpentry", 7),
          Skill("Joe",  "ceiling",   5),
          Skill("Joe",  "roofing",   6),
          Skill("Joe",  "windows",   8),
          Skill("Joe",  "facade",    5),
          Skill("Joe",  "garden",    5),
          Skill("Joe",  "moving",    6),
          Skill("Jack", "masonry",   5),
          Skill("Jack", "plumbing",  7),
          Skill("Jack", "ceiling",   8),
          Skill("Jack", "roofing",   7),
          Skill("Jack", "painting",  9),
          Skill("Jack", "facade",    5),
          Skill("Jack", "garden",    5),
          Skill("Jim",  "carpentry", 5),
          Skill("Jim",  "painting",  6),
          Skill("Jim",  "windows",   5),
          Skill("Jim",  "garden",    9),
          Skill("Jim",  "moving",    8)
          }

Utility functions

find_tasks: returns the task it refers to in the TASKS vector



In [ ]:

    
def find_tasks(name):
    return next(t for t in TASKS if t.name == name)

find_skills: returns the skill it refers to in the SKILLS vector



In [ ]:

    
def find_skills(worker, task):
    return next(s for s in SKILLS if (s.worker == worker) and (s.task == task))

find_max_level_skill: returns the tuple "skill" where the level is themaximum for a given task



In [ ]:

    
def find_max_level_skill(task):
    st = [s for s in SKILLS if s.task == task]
    return next(sk for sk in st if sk.level == max([s.level for s in st]))

Step 4: Set up the prescriptive model

Create the model container

The model is represented by a Python object that is filled with the different model elements (variables, constraints, objective function, etc). The first thing to do is then to create such an object:



In [ ]:

    
mdl = CpoModel(name="HouseBuilding")

Define the decision variables

Concept: interval variable

What for?

Modeling an interval of time during which a particular property holds
(an activity executes, a resource is idle, a tank must be non-empty, …)
Example:

interval_var(start=(0,1000), end=(0,1000), size=(10,20))
Properties:
- The **value** of an interval variable is an integer interval [start,end)
- **Domain** of possible values: [0,10), [1,11), [2,12),...[990,1000), [0,11),[1,12),...
- Domain of interval variables is represented **compactly** in CP Optimizer (a few bounds: smin, smax, emin, emax, szmin, szmax)

For each house, an interval variable is created for each task.
This interval must start and end inside the period_domain and its duration is set as the value stated in TASKS definition.



In [ ]:

    
tasks = {}   # dict of interval variable for each house and task
for house in HOUSES:
    for task in TASKS:
        tasks[(house, task)] = mdl.interval_var(start=period_domain,
                                                end=period_domain,
                                                size=task.duration,
                                                name="house {} task {}".format(house, task))

Concept: optional interval variable

Interval variables can be defined as being **optional** that is, it is part of the decisions of the problem to decide whether the interval will be **present** or **absent** in the solution
What for?

Modeling optional activities, alternative execution modes for activities, and … most of the discrete decisions in a schedule
Example:

interval_var(optional=True, start=(0,1000), end=(0,1000), size=(10,20))
Properties:
- An optional interval variable has an additional possible value in its domain (absence value)
- **Optionality** is a powerful property that you must learn to leverage in your models

For each house, an optional interval variable is created for each skill.
Skill being a tuple (worker, task, level), this means that for each house, an optional interval variable is created for each couple worker-task such that the skill level of this worker for this task is > 0.

The "set_optional()" specifier allows a choice between different variables, thus between different couples house-skill. This means that the engine decides if the interval will be present or absent in the solution.



In [ ]:

    
wtasks = {}  # dict of interval variable for each house and skill
for house in HOUSES:
    for skill in SKILLS:
        iv = mdl.interval_var(name='H' + str(house) + '-' + skill.task + '(' + skill.worker + ')')
        iv.set_optional()
        wtasks[(house, skill)] = iv

Express the business constraints

Temporal constraints

Concept: precedence constraint

What for?

Modeling temporal constraints between interval variables

Modeling constant or variable minimal delays </ul>

Properties

Semantic of the constraints handles optionality (as for all constraints in CP Optimizer).
Example of endBeforeStart:
end_before_start(a,b,z)
present(a) AND present(b) &Implies end(a)+z &LessSlantEqual start(b)

The tasks in the model have precedence constraints that are added to the model.



In [ ]:

    
for h in HOUSES:
    for p in TASK_PRECEDENCES:
        mdl.add(mdl.end_before_start(tasks[(h, find_tasks(p.beforeTask))], tasks[(h, find_tasks(p.afterTask))]))

Alternative workers

Concept: alternative constraint

What for?
- Modeling alternative resource/modes/recipes
- In general modeling a discrete selection in the schedule
Example:

alternative(a,[b1,...,bn])
Remark: Master interval variable **a** can of course be optional

To constrain the solution so that exactly one of the interval variables wtasks associated with a given task of a given house is to be present in the solution, an "alternative" constraint is used.



In [ ]:

    
for h in HOUSES:
    for t in TASKS:
        mdl.add(mdl.alternative(tasks[(h, t)], [wtasks[(h, s)] for s in SKILLS if (s.task == t.name)], 1))

No overlap constraint

Concept: No-overlap constraint

Constraint noOverlap schedules a group of interval variables in such a way that they do not overlap in time.

Absent interval variables are ignored.

It is possible to constrain minimum delays between intervals using transition matrix.

It is possible to constraint the first, last in the sequence or next or preceding interval </ul>

To add the constraints that a given worker can be assigned only one task at a given moment in time, a noOverlap constraint is used.



In [ ]:

    
for w in WORKERS:
    mdl.add(mdl.no_overlap([wtasks[(h, s)] for h in HOUSES for s in SKILLS if s.worker == w]))

Express the objective

The presence of an interval variable in wtasks in the solution must be accounted for in the objective. Thus for each of these possible tasks, the cost is incremented by the product of the skill level and the expression representing the presence of the interval variable in the solution.

The objective of this problem is to maximize the skill level used for all the tasks.



In [ ]:

    
obj = mdl.sum([s.level * mdl.presence_of(wtasks[(h, s)]) for s in SKILLS for h in HOUSES])
mdl.add(mdl.maximize(obj))

Solve the model

The model is now completely defined. It is time to solve it !



In [ ]:

    
# Solve the model
print("\nSolving model....")
msol = mdl.solve(TimeLimit=10)

Step 5: Investigate the solution and then run an example analysis



In [ ]:

    
print("Solve status: " + msol.get_solve_status())
if msol.is_solution():
    stdout.write("Solve time: " + str(msol.get_solve_time()) + "\n")
    # Sort tasks in increasing begin order
    ltasks = []
    for hs in HOUSES:
        for tsk in TASKS:
            (beg, end, dur) = msol[tasks[(hs, tsk)]]
            ltasks.append((hs, tsk, beg, end, dur))
    ltasks = sorted(ltasks, key = lambda x : x[2])
    # Print solution
    print("\nList of tasks in increasing start order:")
    for tsk in ltasks:
        print("From " + str(tsk[2]) + " to " + str(tsk[3]) + ", " + tsk[1].name + " in house " + str(tsk[0]))
else:
    stdout.write("No solution found\n")

Import graphical tools

You can set POP_UP_GRAPHIC=True if you prefer a pop up graphic window instead of an inline one.



In [ ]:

    
POP_UP_GRAPHIC=False



In [ ]:

    
import docplex.cp.utils_visu as visu
import matplotlib.pyplot as plt
if not POP_UP_GRAPHIC:
    %matplotlib inline
#Change the plot size
from pylab import rcParams
rcParams['figure.figsize'] = 15, 3

Draw solution

Useful functions

With the aim to facilitate the display of tasks names, we keep only the n first characters.



In [ ]:

    
def compact_name(name,n): return name[:n]



In [ ]:

    
if msol and visu.is_visu_enabled():
    workers_colors = {}
    workers_colors["Joe"] = 'lightblue'
    workers_colors["Jack"] = 'violet'
    workers_colors["Jim"] = 'lightgreen'
    visu.timeline('Solution per houses', 0, MAX_AMOUNT_OF_PERIODS)
    for h in HOUSES:
        visu.sequence(name="house " + str(h))
        for s in SKILLS:
            wt = msol.get_var_solution(wtasks[(h,s)])
            if wt.is_present():
                color = workers_colors[s.worker]
                wtname = compact_name(s.task,2)
                visu.interval(wt, color, wtname)
    visu.show()

The purpose of this function is to compact the names of the different tasks with the aim of making the graphical display readable. </p> For example "H3-garden" becomes "G3"



In [ ]:

    
def compact_house_task(name):
    loc, task = name[1:].split('-', 1)
    return task[0].upper() + loc

Green-like color when task is using the most skilled worker Red-like color when task does not use the most skilled worker



In [ ]:

    
if msol and visu.is_visu_enabled():
    visu.timeline('Solution per workers', 0, MAX_AMOUNT_OF_PERIODS)
    for w in WORKERS:
        visu.sequence(name=w)
        for h in HOUSES:
            for s in SKILLS:
                if s.worker == w:
                    wt = msol.get_var_solution(wtasks[(h,s)])
                    if wt.is_present():
                        ml = find_max_level_skill(s.task).level
                        if s.level == ml:
                            color = 'lightgreen'
                        else:
                            color = 'salmon'
                        wtname = compact_house_task(wt.get_name())
                        visu.interval(wt, color, wtname)
    visu.show()

Going further with Constraint Programming

The last available installable package is available on Pypi here: https://pypi.python.org/pypi/docplex

A complete set of modeling examples can be downloaded here: https://github.com/IBMDecisionOptimization/docplex-examples

Summary

You learned how to set up and use the IBM Decision Optimization CPLEX Modeling for Python to build and solve a Constraint Programming model.

References

CPLEX Modeling for Python documentation
Decision Optimization on Cloud
Need help with DOcplex or to report a bug? Please go here
Contact us at dofeedback@wwpdl.vnet.ibm.com