This tutorial includes everything you need to set up IBM Decision Optimization CPLEX Modeling for Python (DOcplex), build a Mathematical Programming model, and get its solution by solving the model with IBM ILOG CPLEX Optimizer.
When you finish this tutorial, you'll have a foundational knowledge of Prescriptive Analytics.
This notebook is part of Prescriptive Analytics for Python.
It requires an installation of CPLEX Optimizers. Alternatively, it can be run on IBM Watson Studio Cloud (sign up for a free IBM Cloud account and you can start using Watson Studio Cloud right away).
A set of business constraints has to be respected:
Prescriptive analytics technology recommends actions based on desired outcomes, taking into account specific scenarios, resources, and knowledge of past and current events. This insight can help your organization make better decisions and have greater control of business outcomes.
Prescriptive analytics is the next step on the path to insight-based actions. It creates value through synergy with predictive analytics, which analyzes data to predict future outcomes.
Prescriptive analytics takes that insight to the next level by suggesting the optimal way to handle that future situation. Organizations that can act fast in dynamic conditions and make superior decisions in uncertain environments gain a strong competitive advantage.
For example:
The predictions show which offers a customer is most likely to accept, and the confidence that they will accept, depending on each customer’s details.
For example:
(139987, "Pension", 0.13221, "Mortgage", 0.10675)
indicates that customer Id=139987 is unlikely to buy a Pension, as the confidence level is only 13.2%,
whereas
(140030, "Savings", 0.95678, "Pension", 0.84446)
indicates that customer Id=140030 is very likely to buy Savings and a Pension, as the confidence levels are 95.7% and 84.4%.
This data is taken from an SPSS example, except that the names of the customers were modified.
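To make the reading of one prediction record concrete, here is a minimal sketch that turns a tuple into a readable line with percentages. The helper name `describe_prediction` is hypothetical and is not part of the notebook; it only assumes the five-field layout shown above.

```python
def describe_prediction(record):
    """Format one (id, product1, confidence1, product2, confidence2) tuple as text."""
    customer_id, product1, conf1, product2, conf2 = record
    # {:.1%} multiplies by 100 and keeps one decimal place.
    return "customer {}: {} {:.1%}, {} {:.1%}".format(
        customer_id, product1, conf1, product2, conf2)

summary = describe_prediction((140030, "Savings", 0.95678, "Pension", 0.84446))
print(summary)  # customer 140030: Savings 95.7%, Pension 84.4%
```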
A Python data analysis library, pandas, is used to store the data. Let's set up and declare the data.
In [ ]:
import pandas as pd
names = {
139987 : "Guadalupe J. Martinez", 140030 : "Michelle M. Lopez", 140089 : "Terry L. Ridgley",
140097 : "Miranda B. Roush", 139068 : "Sandra J. Wynkoop", 139154 : "Roland Guérette", 139158 : "Fabien Mailhot",
139169 : "Christian Austerlitz", 139220 : "Steffen Meister", 139261 : "Wolfgang Sanger",
139416 : "Lee Tsou", 139422 : "Sanaa' Hikmah Hakimi", 139532 : "Miroslav Škaroupka",
139549 : "George Blomqvist", 139560 : "Will Henderson", 139577 : "Yuina Ohira", 139580 : "Vlad Alekseeva",
139636 : "Cassio Lombardo", 139647 : "Trinity Zelaya Miramontes", 139649 : "Eldar Muravyov", 139665 : "Shu T'an",
139667 : "Jameel Abdul-Ghani Gerges", 139696 : "Zeeb Longoria Marrero", 139752 : "Matheus Azevedo Melo",
139832 : "Earl B. Wood", 139859 : "Gabrielly Sousa Martins", 139881 : "Franca Palermo"}
data = [(139987, "Pension", 0.13221, "Mortgage", 0.10675), (140030, "Savings", 0.95678, "Pension", 0.84446), (140089, "Savings", 0.95678, "Pension", 0.80233),
(140097, "Pension", 0.13221, "Mortgage", 0.10675), (139068, "Pension", 0.80506, "Savings", 0.28391), (139154, "Pension", 0.13221, "Mortgage", 0.10675),
(139158, "Pension", 0.13221, "Mortgage", 0.10675),(139169, "Pension", 0.13221, "Mortgage", 0.10675), (139220, "Pension", 0.13221, "Mortgage", 0.10675),
(139261, "Pension", 0.13221, "Mortgage", 0.10675), (139416, "Pension", 0.13221, "Mortgage", 0.10675), (139422, "Pension", 0.13221, "Mortgage", 0.10675),
(139532, "Savings", 0.95676, "Mortgage", 0.82269), (139549, "Savings", 0.16428, "Pension", 0.13221), (139560, "Savings", 0.95678, "Pension", 0.86779),
(139577, "Pension", 0.13225, "Mortgage", 0.10675), (139580, "Pension", 0.13221, "Mortgage", 0.10675), (139636, "Pension", 0.13221, "Mortgage", 0.10675),
(139647, "Savings", 0.28934, "Pension", 0.13221), (139649, "Pension", 0.13221, "Mortgage", 0.10675), (139665, "Savings", 0.95675, "Pension", 0.27248),
(139667, "Pension", 0.13221, "Mortgage", 0.10675), (139696, "Savings", 0.16188, "Pension", 0.13221), (139752, "Pension", 0.13221, "Mortgage", 0.10675),
(139832, "Savings", 0.95678, "Pension", 0.83426), (139859, "Savings", 0.95678, "Pension", 0.75925), (139881, "Pension", 0.13221, "Mortgage", 0.10675)]
products = ["Car loan", "Savings", "Mortgage", "Pension"]
productValue = [100, 200, 300, 400]
budgetShare = [0.6, 0.1, 0.2, 0.1]
availableBudget = 500
channels = pd.DataFrame(data=[("gift", 20.0, 0.20), ("newsletter", 15.0, 0.05), ("seminar", 23.0, 0.30)], columns=["name", "cost", "factor"])
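As a quick sanity check on these numbers, the sketch below estimates how many offers the budget could pay for if a single channel were used for every offer. It is plain Python with hypothetical variable names (`channel_costs`, `max_offers`), with the values copied from the cell above. It ignores the other constraints of the model, so it is only an upper bound per channel.

```python
import math

# Values copied from the data cell; the names are illustrative only.
available_budget = 500
channel_costs = {"gift": 20.0, "newsletter": 15.0, "seminar": 23.0}

# Largest number of offers the budget covers when only one channel is used.
max_offers = {name: math.floor(available_budget / cost)
              for name, cost in channel_costs.items()}
print(max_offers)  # {'gift': 25, 'newsletter': 33, 'seminar': 21}
```

With 27 customers in the data, only the newsletter channel could reach everyone within the budget, which suggests the budget constraint can matter in the optimization.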
Offers are stored in a pandas DataFrame.
In [ ]:
# The default RangeIndex already numbers the rows 0..len(data)-1.
offers = pd.DataFrame(data=data, columns=["customerid", "Product1", "Confidence1", "Product2", "Confidence2"])
# Add each customer's name as the first column, looked up by customer id.
offers.insert(0, 'name', [names[row[0]] for row in data])
Let's customize the display of this data and show the confidence forecast for each customer.
In [ ]:
CSS = """
body {
margin: 0;
font-family: Helvetica;
}
table.dataframe {
border-collapse: collapse;
border: none;
}
table.dataframe tr {
border: none;
}
table.dataframe td, table.dataframe th {
margin: 0;
border: 1px solid white;
padding-left: 0.25em;
padding-right: 0.25em;
}
table.dataframe th:not(:empty) {
background-color: #fec;
text-align: left;
font-weight: normal;
}
table.dataframe tr:nth-child(2) th:empty {
border-left: none;
border-right: 1px dashed #888;
}
table.dataframe td {
border: 2px solid #ccf;
background-color: #f4f4ff;
}
table.dataframe thead th:first-child {
display: none;
}
table.dataframe tbody th {
display: none;
}
"""
In [ ]:
from IPython.display import HTML, display
# Wrap the stylesheet in display() so it is applied even when it is not the last expression in the cell.
display(HTML('<style>{}</style>'.format(CSS)))
# drop(columns=...) requires pandas >= 0.21.
display(offers.drop(columns='customerid').sort_values(by='name'))
In [ ]:
try:
    import docplex.mp
except ImportError:
    raise Exception('Please install docplex. See https://pypi.org/project/docplex/')
If cplex is not installed, install CPLEX Community edition.
In [ ]:
try:
    import cplex
except ImportError:
    raise Exception('Please install CPLEX. See https://pypi.org/project/cplex/')
In [ ]:
from docplex.mp.model import Model
mdl = Model(name="marketing_campaign")
The binary decision variables channelVars represent whether or not a customer will be made an offer for a particular product via a particular channel.
The integer decision variable totaloffers represents the total number of offers made.
The continuous variable budgetSpent represents the total cost of the offers made.
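To get a feel for the size of the model, the following sketch enumerates the index triples that the channel variables range over. It uses only the standard library rather than DOcplex, and the sizes (27 offers, 4 products, 3 channels) are taken from the data declared earlier; the names `n_offers`, `n_products`, `n_channels` and `index_triples` are illustrative.

```python
from itertools import product

# Sizes of the data declared earlier in the notebook.
n_offers, n_products, n_channels = 27, 4, 3

# One yes/no decision per (offer, product, channel) triple.
index_triples = list(product(range(n_offers), range(n_products), range(n_channels)))
print(len(index_triples))  # 324
print(index_triples[:3])   # [(0, 0, 0), (0, 0, 1), (0, 0, 2)]
```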
In [ ]:
offersR = range(0, len(offers))
productsR = range(0, len(products))
channelsR = range(0, len(channels))
# One binary variable per (offer, product, channel) triple.
channelVars = mdl.binary_var_cube(offersR, productsR, channelsR)
totaloffers = mdl.integer_var(lb=0)
budgetSpent = mdl.continuous_var()
In [ ]:
# At most 1 product is offered to each customer
mdl.add_constraints( mdl.sum(channelVars[o,p,c] for p in productsR for c in channelsR) <=1
                     for o in offersR)
# totaloffers counts every offer made
mdl.add_constraint( totaloffers == mdl.sum(channelVars[o,p,c]
                                           for o in offersR
                                           for p in productsR
                                           for c in channelsR) )
# budgetSpent is the total channel cost of the offers made
mdl.add_constraint( budgetSpent == mdl.sum(channelVars[o,p,c]*channels.at[c, "cost"]
                                           for o in offersR
                                           for p in productsR
                                           for c in channelsR) )
# Balance the offers among products
for p in productsR:
    mdl.add_constraint( mdl.sum(channelVars[o,p,c] for o in offersR for c in channelsR)
                        <= budgetShare[p] * totaloffers )
# Do not exceed the budget
mdl.add_constraint( mdl.sum(channelVars[o,p,c]*channels.at[c, "cost"]
                            for o in offersR
                            for p in productsR
                            for c in channelsR) <= availableBudget )
mdl.print_information()
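The constraints above can also be read as a plain feasibility check on a candidate plan. The sketch below is not DOcplex code and is not how the solver works internally; it is a hypothetical helper, `plan_is_feasible`, that tests a list of selected (offer, product, channel) index triples against the same three rules: at most one offer per customer, product shares, and budget.

```python
from collections import Counter

def plan_is_feasible(plan, channel_cost, budget_share, available_budget):
    """Check a list of selected (offer, product, channel) index triples."""
    # Rule 1: at most one offer per customer.
    per_customer = Counter(o for o, _, _ in plan)
    if any(count > 1 for count in per_customer.values()):
        return False
    # Rule 2: each product's offer count stays within its share of all offers.
    total = len(plan)
    per_product = Counter(p for _, p, _ in plan)
    tolerance = 1e-9  # guards against floating-point noise in share * total
    if any(per_product[p] > budget_share[p] * total + tolerance for p in per_product):
        return False
    # Rule 3: the total channel cost stays within the budget.
    spent = sum(channel_cost[c] for _, _, c in plan)
    return spent <= available_budget

channel_cost = [20.0, 15.0, 23.0]   # gift, newsletter, seminar
budget_share = [0.6, 0.1, 0.2, 0.1]  # Car loan, Savings, Mortgage, Pension

# Ten offers to ten different customers, all by newsletter (channel index 1).
balanced = [(o, p, 1) for o, p in enumerate([0] * 6 + [1] + [2] * 2 + [3])]
print(plan_is_feasible(balanced, channel_cost, budget_share, 500))      # True
# A single Savings offer breaks the share rule: 1 > 0.1 * 1.
print(plan_is_feasible([(0, 1, 0)], channel_cost, budget_share, 500))   # False
```

The second call shows a consequence of the balance rule as written: a product with a 10% share can only be offered once the plan contains at least ten offers in total.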
In [ ]:
mdl.maximize(
    mdl.sum( channelVars[idx,p,idx2] * c.factor * productValue[p]* o.Confidence1
             for p in productsR
             for idx,o in offers[offers['Product1'] == products[p]].iterrows()
             for idx2, c in channels.iterrows())
    +
    mdl.sum( channelVars[idx,p,idx2] * c.factor * productValue[p]* o.Confidence2
             for p in productsR
             for idx,o in offers[offers['Product2'] == products[p]].iterrows()
             for idx2, c in channels.iterrows())
    )
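Each term of the objective is a product of three numbers: the channel factor, the product value, and the predicted confidence. The sketch below works through that arithmetic for one customer in plain Python. The dictionaries and the helper `expected_value` are illustrative names, with values copied from the data cell.

```python
# Values copied from the data cell; the names are illustrative only.
channel_factor = {"gift": 0.20, "newsletter": 0.05, "seminar": 0.30}
product_value = {"Car loan": 100, "Savings": 200, "Mortgage": 300, "Pension": 400}

def expected_value(channel, product, confidence):
    """Objective coefficient of one (offer, product, channel) variable."""
    return channel_factor[channel] * product_value[product] * confidence

# Customer 140030: (140030, "Savings", 0.95678, "Pension", 0.84446)
savings_by_seminar = expected_value("seminar", "Savings", 0.95678)
pension_by_seminar = expected_value("seminar", "Pension", 0.84446)
print(round(savings_by_seminar, 2), round(pension_by_seminar, 2))  # 57.41 101.34
```

For this customer the Pension offer has the larger coefficient even though its confidence is lower, because the product value is twice as high. Which offer the solver actually picks also depends on the constraints.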
In [ ]:
s = mdl.solve()
assert s, "No Solution !!!"
In [ ]:
# Solution values are returned as floats, so test against 0.5 instead of comparing with == 1.
report = [(channels.at[c, "name"], products[p], names[offers.at[o, "customerid"]])
          for c in channelsR
          for p in productsR
          for o in offersR if channelVars[o,p,c].solution_value > 0.5]
assert len(report) == round(totaloffers.solution_value)
print("Marketing plan has {0} offers costing {1}".format(totaloffers.solution_value, budgetSpent.solution_value))
report_bd = pd.DataFrame(report, columns=['channel', 'product', 'customer'])
display(report_bd)
Now let's focus on the offers made through the seminar channel.
In [ ]:
display(report_bd[report_bd['channel'] == "seminar"].drop(columns='channel'))
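The same kind of per-channel view can be sketched without pandas. The rows below are hypothetical placeholders in the (channel, product, customer) shape of `report`; they are not solver output, and the names `sample_report`, `offers_per_channel` and `seminar_rows` are illustrative.

```python
from collections import Counter

# Hypothetical rows shaped like `report`; these are NOT results from the model.
sample_report = [
    ("gift", "Pension", "Customer A"),
    ("seminar", "Savings", "Customer B"),
    ("seminar", "Pension", "Customer C"),
]

# Count the offers made through each channel.
offers_per_channel = Counter(channel for channel, _, _ in sample_report)

# Keep only the seminar rows and drop the channel field, as in the cell above.
seminar_rows = [(product, customer)
                for channel, product, customer in sample_report
                if channel == "seminar"]
print(offers_per_channel["seminar"], seminar_rows)
```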
Copyright © 2017-2019 IBM. IPLA licensed Sample Materials.