This tutorial includes everything you need to set up decision optimization engines, build constraint programming models.
When you finish this tutorial, you'll have a foundational knowledge of Prescriptive Analytics.
This notebook is part of Prescriptive Analytics for Python
It requires either an installation of CPLEX Optimizers or it can be run on IBM Watson Studio Cloud (Sign up for a free IBM Cloud account and you can start using Watson Studio Cloud right away).
Table of contents:
A detailed description (from which this paragraph comes from) is available on Wikipedia at https://en.wikipedia.org/wiki/Golomb_ruler.
In mathematics, a Golomb ruler is a set of marks at integer positions along an imaginary ruler such that no two pairs of marks are the same distance apart. The number of marks on the ruler is its order, and the largest distance between two of its marks is its length.
Following is an example of Golomb ruler of order 4 and length 6.
This problem is not only an intellectual problem. It has a lot of practical applications:
Prescriptive analytics technology recommends actions based on desired outcomes, taking into account specific scenarios, resources, and knowledge of past and current events. This insight can help your organization make better decisions and have greater control of business outcomes.
Prescriptive analytics is the next step on the path to insight-based actions. It creates value through synergy with predictive analytics, which analyzes data to predict future outcomes.
Prescriptive analytics takes that insight to the next level by suggesting the optimal way to handle that future situation. Organizations that can act fast in dynamic conditions and make superior decisions in uncertain environments gain a strong competitive advantage.
For example:
Constraint Programming is a programming paradigm that allows to express a problem using:
All this information, plus some configuration parameters, is aggregated into a single object called model.
The remainder of this notebook describes in details how to build and solve this problem with IBM CP Optimizer, using its DOcplex Python modeling API.
In [ ]:
import sys
try:
import docplex.cp
except:
if hasattr(sys, 'real_prefix'):
#we are in a virtual env.
!pip install docplex
else:
!pip install --user docplex
Note that the more global package docplex contains another subpackage docplex.mp that is dedicated to Mathematical Programming, another branch of optimization.
In [ ]:
# Import Constraint Programming modelization functions
from docplex.cp.model import CpoModel
The first thing to define is the model input data. In the case of the Golomb Ruler problem, there is only one input which is the order of the ruler, that is the number of marks:
In [ ]:
# Define required number of marks on the ruler
ORDER = 7
The model is represented by a Python object that is filled with the different model elements (variables, constraints, objective function, etc). The first thing to do is then to create such an object:
In [ ]:
# Create model object
mdl = CpoModel(name="GolombRuler")
Now, we need to define the variables of the problem. As the expected problem result is the list of mark positions, the simplest choice is to create one integer variable to represent the position of each mark on the ruler.
Each variable has a a set of possible values called his domain. To reduce the search space, it is important to reduce this domain as far as possible.
In our case, we can naively estimate that the maximum distance between two adjacent marks is the order of the ruler minus one. Then the maximal position of a mark is (ORDER - 1)². Each variable domain is then limited to an interval [0..(ORDER - 1)²].
A list of integer variables can be defined using method integer_var_list(). In our case, defining one variable for each mark can be created as follows:
In [ ]:
# Create array of variables corresponding to ruler marks
marks = mdl.integer_var_list(ORDER, 0, (ORDER - 1) ** 2, "M")
In [ ]:
# Create an array with all distances between all marks
dist = [marks[i] - marks[j] for i in range(1, ORDER) for j in range(0, i)]
We have used here the operator '-' to express the difference between variables. It may appear strange as the variables are not instanciated at that time, but the Python operator has been overloaded to construct a CP expression instead of attempting to compute the arithmetic difference. All other standard Python operators can be used to make operations between CP objects (<, >, <=, >=, ==, !=, +, -, /, *, &, |, //, **, ...). Have a look to documentation for details.
To force all these distances to be different, we use the special all_diff() constraint as follows:
In [ ]:
# Force all distances to be different
mdl.add(mdl.all_diff(dist))
The call mdl.add(...) is necessary to express that the constraint must be added to the model.
The constraint we have expressed above is theoritically enough, and the model can be solved as it is.
However, it does not differentiate between all possible permutations of the different mark positions that are solutions to the problem, for example, 0-1-4-6, 4-6-1-0, 6-0-1-4, etc. As there are ORDER! (factorial of ORDER) such permutations, the search space would be drastically reduced by removing them.
We can do that by forcing an order between marks, for example the order of their index:
In [ ]:
# Avoid symmetric solutions by ordering marks
for i in range(1, ORDER):
mdl.add(marks[i] > marks[i - 1])
We also know that first mark is at the beginning of the ruler:
In [ ]:
# Force first mark position to zero
mdl.add(marks[0] == 0)
Each optimal solution has a mirror, with all mark distances in the reverse order, for example, 0-1-4-6 and 0-2-5-6. The following constraint can be added to avoid this:
In [ ]:
# Avoid mirror solution
mdl.add((marks[1] - marks[0]) < (marks[ORDER - 1] - marks[ORDER - 2]))
In [ ]:
# Minimize ruler size
mdl.add(mdl.minimize(marks[ORDER - 1]))
If the marks were not ordered, we should have use instead:
mdl.add(mdl.minimize(mdl.max(marks)))
By default, the modeling layer look for a local runtime, but other solving environments, such as docloud, are also available. Look at the documentation for a good understanding of the various solving/generation modes.
If you're using a Community Edition of CPLEX runtimes, depending on the size of the problem, the solve stage may fail and will need a paying subscription or product installation.
The model can be solved by calling:
In [ ]:
# Solve the model
print("Solving model....")
msol = mdl.solve(TimeLimit=10)
In [ ]:
# Print solution
print("Solution: ")
msol.write()
This output is totally generic and simply prints the value of all model variables, the objective value, and some other solving information.
A more specific output can be generated by writing more code. The following example illustrates how to access specific elements of the solution.
In [ ]:
# Print solution
from sys import stdout
if msol:
# Print found solution
stdout.write("Solution: " + msol.get_solve_status() + "\n")
stdout.write("Position of ruler marks: ")
for v in marks:
stdout.write(" " + str(msol[v]))
stdout.write("\n")
stdout.write("Solve time: " + str(round(msol.get_solve_time(), 2)) + "s\n")
else:
# No solution found
stdout.write("No solution found. Search status: " + msol.get_solve_status() + "\n")
Another possibility is for example to simulate real ruler using characters, as follows:
In [ ]:
# Print solution as a ruler
if msol:
stdout.write("Ruler: +")
for i in range(1, ORDER):
stdout.write('-' * (msol[marks[i]] - msol[marks[i - 1]] - 1) + '+')
stdout.write("\n")
The last available installable package is available on Pypi here: https://pypi.python.org/pypi/docplex
A complete set of modeling examples can be downloaded here: https://github.com/IBMDecisionOptimization/docplex-examples
Copyright © 2017, 2018 IBM. IPLA licensed Sample Materials.