This tutorial shows how to set up a decision optimization engine, build a mathematical programming model, and solve a truck fleet management problem.
When you finish this tutorial, you'll have a foundational knowledge of Prescriptive Analytics.
This notebook is part of Prescriptive Analytics for Python.
It requires either a local installation of CPLEX Optimizers, or it can be run on IBM Watson Studio Cloud (sign up for a free IBM Cloud account and you can start using Watson Studio Cloud right away).
A trucking company has a hub-and-spoke system. Each shipment to be delivered is specified by an originating spoke, a destination spoke, and a shipment volume. Truck types are defined by a maximum capacity, a speed, and a cost per mile. The model assigns the right number of trucks to each route to minimize the cost of transshipment while meeting the volume requirements. Each spoke has a minimum departure time and a maximum return time for trucks, and each hub has a load and unload time. Because trucks of different types travel at different speeds, the time at which a shipment becomes available at a hub depends on the truck type that carries it. Volume availability constraints are taken into account: the shipments that a truck carries back from a hub to a spoke must be available for loading before the truck leaves.
Prescriptive analytics (decision optimization) technology recommends actions that are based on desired outcomes. It takes into account specific scenarios, resources, and knowledge of past and current events. With this insight, your organization can make better decisions and have greater control of business outcomes.
Prescriptive analytics is the next step on the path to insight-based actions. It creates value through synergy with predictive analytics, which analyzes data to predict future outcomes.
Prescriptive analytics takes that insight to the next level by suggesting the optimal way to handle that future situation. Organizations that can act fast in dynamic conditions and make superior decisions in uncertain environments gain a strong competitive advantage.
With prescriptive analytics, you can automate complex decisions and trade-offs to better manage your limited resources, take advantage of future opportunities or mitigate future risks, and proactively update recommendations based on changing events to meet your operational goals.
In [ ]:
import sys
try:
    import docplex.mp
except ImportError:
    raise Exception('Please install docplex. See https://pypi.org/project/docplex/')
If CPLEX is not installed, install the CPLEX Community Edition.
In [ ]:
try:
    import cplex
except ImportError:
    raise Exception('Please install CPLEX. See https://pypi.org/project/cplex/')
In [ ]:
from collections import namedtuple
In [ ]:
_parameters = namedtuple('parameters', ['maxTrucks', 'maxVolume'])
_location = namedtuple('location', ['name'])
_spoke = namedtuple('spoke', ['name', 'minDepTime', 'maxArrTime'])
_truckType = namedtuple('truckType', ['truckType', 'capacity', 'costPerMile', 'milesPerHour'])
_loadTimeInfo = namedtuple('loadTimeInfo', ['hub', 'truckType', 'loadTime'])
_routeInfo = namedtuple('routeInfo', ['spoke', 'hub', 'distance'])
_triple = namedtuple('triple', ['origin', 'hub', 'destination'])
_shipment = namedtuple('shipment', ['origin', 'destination', 'totalVolume'])
In [ ]:
import requests
import json
import decimal

r = requests.get("https://github.com/vberaudi/utwt/blob/master/trucking.json?raw=true")
r.raise_for_status()
json_data = json.loads(r.text, parse_float=decimal.Decimal)
In [ ]:
def read_json_tuples(name, my_namedtuple):
    """Read a list of named tuples from the JSON data."""
    json_fragment = json_data[name]
    fields = my_namedtuple._fields
    return [my_namedtuple(*(t[field] for field in fields)) for t in json_fragment]

def read_json_tuple(name, my_namedtuple):
    """Read a single named tuple from the JSON data."""
    json_fragment = json_data[name]
    fields = my_namedtuple._fields
    return my_namedtuple(*(json_fragment[field] for field in fields))
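To see how the field-driven readers behave, here is a minimal, self-contained sketch on a hypothetical JSON fragment (the spoke names and times below are made up, not the tutorial's actual data):

```python
from collections import namedtuple

# Hypothetical fragment in the same shape as the notebook's JSON data
json_data = {"Spokes": [{"name": "A", "minDepTime": 360, "maxArrTime": 1080},
                        {"name": "B", "minDepTime": 420, "maxArrTime": 1020}]}

_spoke = namedtuple('spoke', ['name', 'minDepTime', 'maxArrTime'])

def read_json_tuples(name, my_namedtuple):
    # Pick each namedtuple field out of the corresponding JSON dict,
    # in the order declared by the namedtuple itself
    return [my_namedtuple(*(t[f] for f in my_namedtuple._fields))
            for t in json_data[name]]

spokes = read_json_tuples('Spokes', _spoke)
print(spokes[0].minDepTime)  # fields are then accessible by name
```

Because the reader is driven by `_fields`, the same function works unchanged for any of the record types defined above.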
Use basic HTML and a stylesheet to format the data.
In [ ]:
CSS = """
body {
margin: 0;
font-family: Helvetica;
}
table.dataframe {
border-collapse: collapse;
border: none;
}
table.dataframe tr {
border: none;
}
table.dataframe td, table.dataframe th {
margin: 0;
border: 1px solid white;
padding-left: 0.25em;
padding-right: 0.25em;
}
table.dataframe th:not(:empty) {
background-color: #fec;
text-align: left;
font-weight: normal;
}
table.dataframe tr:nth-child(2) th:empty {
border-left: none;
border-right: 1px dashed #888;
}
table.dataframe td {
border: 2px solid #ccf;
background-color: #f4f4ff;
}
table.dataframe thead th:first-child {
display: none;
}
table.dataframe tbody th {
display: none;
}
"""
from IPython.core.display import HTML
HTML('<style>{}</style>'.format(CSS))
In [ ]:
parameters = read_json_tuple(name='Parameters', my_namedtuple=_parameters)
hubs = read_json_tuples(name='Hubs', my_namedtuple=_location)
truckTypes = read_json_tuples(name='TruckTypes', my_namedtuple=_truckType)
spokes = read_json_tuples(name='Spokes', my_namedtuple=_spoke)
loadTimes = read_json_tuples(name='LoadTimes', my_namedtuple=_loadTimeInfo)
routes = read_json_tuples(name='Routes', my_namedtuple=_routeInfo)
shipments = read_json_tuples(name='Shipments', my_namedtuple=_shipment)
Extract the model parameters and build lookup dictionaries for hubs, spokes, truck types, and load times.
Then compute the set of feasible paths (triples): a shipment can travel from an origin spoke to a destination spoke via a hub only if both spokes are connected to that hub.
In [ ]:
maxTrucks = parameters.maxTrucks
maxVolume = parameters.maxVolume
hubIds = {h.name for h in hubs}
spokeIds = {s.name for s in spokes}
spoke = {s.name: s for s in spokes}
truckTypeIds = {tti.truckType for tti in truckTypes}
truckTypeInfos = {tti.truckType: tti for tti in truckTypes}
loadTime = {(lt.hub, lt.truckType): lt.loadTime for lt in loadTimes}
# Feasible paths from spoke to spoke via one hub
triples = {_triple(r1.spoke, r1.hub, r2.spoke)
           for r1 in routes for r2 in routes
           if r1 != r2 and r1.hub == r2.hub}
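The triples construction can be illustrated on a tiny, hypothetical network (the spoke and hub names below are made up): two spokes sharing a hub yield a path in each direction, while a spoke on a different hub yields none.

```python
from collections import namedtuple

_routeInfo = namedtuple('routeInfo', ['spoke', 'hub', 'distance'])
_triple = namedtuple('triple', ['origin', 'hub', 'destination'])

# Hypothetical routes: spokes A and B both connect to hub H; C connects to hub G
routes = [_routeInfo('A', 'H', 100),
          _routeInfo('B', 'H', 150),
          _routeInfo('C', 'G', 80)]

# Same set comprehension as in the model: pair up distinct routes sharing a hub
triples = {_triple(r1.spoke, r1.hub, r2.spoke)
           for r1 in routes for r2 in routes
           if r1 != r2 and r1.hub == r2.hub}

# Only A-H-B and B-H-A are feasible; C shares no hub with A or B
print(sorted(triples))
```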
Some assertions check that the data is consistent.
In [ ]:
# Make sure the data is consistent: the latest arrival time must be
# later than the earliest departure time.
for s in spokeIds:
    assert spoke[s].maxArrTime > spoke[s].minDepTime, "inconsistent data"

# The spoke of each route must be in the set of Spokes.
for r in routes:
    assert r.spoke in spokeIds, "some route spoke is not in the set of Spokes"

# The hub of each route must be in the set of Hubs.
for r in routes:
    assert r.hub in hubIds, "some route hub is not in the set of Hubs"

# The origin of each shipment must be in the set of Spokes.
for s in shipments:
    assert s.origin in spokeIds, "some shipment origin is not in the set of Spokes"

# The destination of each shipment must be in the set of Spokes.
for s in shipments:
    assert s.destination in spokeIds, "some shipment destination is not in the set of Spokes"
In [ ]:
from math import ceil, floor

# The earliest unloading time at a hub for each truck type on each route
earliestUnloadingTime = {
    (r, t): int(ceil(loadTime[r.hub, t]
                     + spoke[r.spoke].minDepTime
                     + 60 * r.distance / truckTypeInfos[t].milesPerHour))
    for t in truckTypeIds for r in routes}

# The latest loading time at a hub for each truck type on each route
latestLoadingTime = {
    (r, t): int(floor(spoke[r.spoke].maxArrTime
                      - loadTime[r.hub, t]
                      - 60 * r.distance / truckTypeInfos[t].milesPerHour))
    for t in truckTypeIds for r in routes}

# Compute which truck types can be assigned to each route.
# A truck type is usable on a route only if it can reach the hub, unload,
# reload, and return to the spoke before the spoke's maximum arrival time.
possibleTruckOnRoute = {
    (r, t): 1 if earliestUnloadingTime[r, t] < latestLoadingTime[r, t] else 0
    for t in truckTypeIds for r in routes}
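A quick worked example of the timing arithmetic, with made-up numbers (not from the tutorial's data set), where times are minutes after midnight:

```python
from math import ceil, floor

# Hypothetical route: spoke 120 miles from the hub, truck at 60 mph,
# 30-minute load/unload time, departures from 6:00 (360) and returns by 18:00 (1080)
loadTime, minDepTime, maxArrTime = 30, 360, 1080
distance, milesPerHour = 120, 60

travel = 60 * distance / milesPerHour  # 120 minutes each way

# Earliest the truck can finish unloading at the hub: depart, drive, unload
earliestUnloading = int(ceil(loadTime + minDepTime + travel))  # 510, i.e. 8:30

# Latest the truck can start loading and still return in time
latestLoading = int(floor(maxArrTime - loadTime - travel))  # 930, i.e. 15:30

# This truck type is usable on the route: it can unload strictly
# before the latest time it must start loading for the return trip
possible = 1 if earliestUnloading < latestLoading else 0
```

With a slower truck or a longer distance, `earliestUnloading` can exceed `latestLoading`, and the type is excluded from the route.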
In [ ]:
from docplex.mp.environment import Environment
env = Environment()
env.print_information()
In [ ]:
from docplex.mp.model import Model
model = Model("truck")
In [ ]:
truckOnRoute = model.integer_var_matrix(keys1=routes, keys2=truckTypeIds, lb=0, ub=maxTrucks, name="TruckOnRoute")
# Volume shipped out of each hub by each truck type on each triple.
# Volumes are distinguished by truck type because trucks of different types
# arrive at a hub at different times, and that timing is used in the
# volume-availability constraints for trucks leaving the hub.
outVolumeThroughHubOnTruck = model.integer_var_matrix(keys1=triples, keys2=truckTypeIds, lb=0, ub=maxVolume, name="OutVolumeThroughHubOnTruck")
# Volume shipped into each hub by each truck type on each triple.
# It is used in defining the timing constraints.
inVolumeThroughHubOnTruck = model.integer_var_matrix(keys1=triples, keys2=truckTypeIds, lb=0, ub=maxVolume, name="InVolumeThroughHubOnTruck")
In [ ]:
for r in routes:
    for t in truckTypeIds:
        model.add_constraint(truckOnRoute[r, t] <= possibleTruckOnRoute[r, t] * maxTrucks)
In [ ]:
for (s, h, dist) in routes:
    for t in truckTypeIds:
        model.add_constraint(
            model.sum(inVolumeThroughHubOnTruck[(s1, h1, dest), t]
                      for (s1, h1, dest) in triples if s1 == s and h1 == h)
            <= truckOnRoute[(s, h, dist), t] * truckTypeInfos[t].capacity)
In [ ]:
for tr in triples:
    model.add_constraint(
        model.sum(inVolumeThroughHubOnTruck[tr, t] for t in truckTypeIds)
        == model.sum(outVolumeThroughHubOnTruck[tr, t] for t in truckTypeIds))
In [ ]:
for (o, d, v) in shipments:
    model.add_constraint(
        model.sum(inVolumeThroughHubOnTruck[(o1, h, d1), t]
                  for t in truckTypeIds
                  for (o1, h, d1) in triples if o1 == o and d1 == d) == v)
In other words, the shipments that a truck carries away from a hub must have arrived at that hub before the truck is loaded. For each route s-h and each departing truck of type t, the constraint reads: the cumulative inbound volume that arrives before the truck's loading time must be at least the cumulative outbound volume loaded up to that time (including the shipments being loaded onto the truck itself).
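The availability condition can be illustrated with plain numbers (all hypothetical, minutes after midnight), outside the optimization model:

```python
# Hypothetical timeline at one hub.
# Inbound deliveries: (unloading-completion time, volume)
inbound = [(510, 40), (540, 25), (600, 30)]

# A departing truck starts loading at time 560 and wants to carry 70 units
loading_time, outbound_volume = 560, 70

# Volume that has arrived at the hub no later than the loading time
available = sum(v for t, v in inbound if t <= loading_time)  # 40 + 25 = 65

# The availability constraint rejects this plan: only 65 of the
# requested 70 units are at the hub when loading starts
feasible = available >= outbound_volume
```

The model enforces the same comparison symbolically, using `earliestUnloadingTime` and `latestLoadingTime` to decide which inbound volumes count as available.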
In [ ]:
for (s, h, dist) in routes:
    for t in truckTypeIds:
        model.add_constraint(
            # Cumulative volume unloaded at hub h before a truck of type t
            # on route (s, h) starts loading
            model.sum(inVolumeThroughHubOnTruck[(o, h, s), t1]
                      for (o, h0, s0) in triples if h0 == h and s0 == s
                      for t1 in truckTypeIds
                      for (o2, h2, dist1) in routes if h2 == h0 and o2 == o
                      if earliestUnloadingTime[(o, h, dist1), t1] <= latestLoadingTime[(s, h, dist), t])
            >=
            # Cumulative volume loaded at hub h up to the loading time of
            # that truck (including the shipments being loaded)
            model.sum(outVolumeThroughHubOnTruck[(o, h, s), t2]
                      for (o, h0, s0) in triples if h0 == h and s0 == s
                      for t2 in truckTypeIds
                      for (o2, h2, dist2) in routes if h2 == h0 and o2 == o
                      if latestLoadingTime[(o, h, dist2), t2] <= latestLoadingTime[(s, h, dist), t]))
In [ ]:
totalCost = model.sum(2 * r.distance * truckTypeInfos[t].costPerMile * truckOnRoute[r, t] for r in routes for t in truckTypeIds)
model.minimize(totalCost)
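The cost term for one route can be checked by hand (with made-up numbers, not the tutorial's data): each assigned truck drives to the hub and back, hence the factor of 2.

```python
# Hypothetical route: 150 miles each way, cost of 2.5 per mile, 3 trucks assigned
distance, costPerMile, nbTrucks = 150, 2.5, 3

# Round trip of 300 miles per truck, at 2.5 per mile, for 3 trucks
routeCost = 2 * distance * costPerMile * nbTrucks
print(routeCost)
```

The objective sums this term over every route and truck type, with `truckOnRoute` as the decision variable in place of `nbTrucks`.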
In [ ]:
model.print_information()
assert model.solve(), "!!! Solve of the model fails"
model.report()
In [ ]:
#solution object model
_result = namedtuple('result', ['totalCost'])
_nbTrucksOnRouteRes = namedtuple('nbTrucksOnRouteRes', ['spoke', 'hub', 'truckType', 'nbTruck'])
_volumeThroughHubOnTruckRes = namedtuple('volumeThroughHubOnTruckRes', ['origin', 'hub', 'destination', 'truckType', 'quantity'])
_aggregatedReport = namedtuple('aggregatedReport', ['spoke', 'hub', 'truckType', 'quantity'])
In [ ]:
# Post-processing: result data structures are exported as tuples or lists of tuples
import pandas as pd

# Solve objective value
result = _result(totalCost.solution_value)

nbTrucksOnRouteRes = pd.DataFrame([_nbTrucksOnRouteRes(r.spoke, r.hub, t, int(truckOnRoute[r, t]))
                                   for r in routes
                                   for t in truckTypeIds
                                   if int(truckOnRoute[r, t]) > 0])

# Volume shipped into each hub by each truck type and each pair (origin, destination)
inVolumeThroughHubOnTruckRes = pd.DataFrame([_volumeThroughHubOnTruckRes(tr.origin, tr.hub, tr.destination, t,
                                                                         int(inVolumeThroughHubOnTruck[tr, t]))
                                             for tr in triples
                                             for t in truckTypeIds
                                             if int(inVolumeThroughHubOnTruck[tr, t]) > 0])

# Volume shipped out of each hub by each truck type and each pair (origin, destination)
outVolumeThroughHubOnTruckRes = pd.DataFrame([_volumeThroughHubOnTruckRes(tr.origin, tr.hub, tr.destination, t,
                                                                          int(outVolumeThroughHubOnTruck[tr, t]))
                                              for tr in triples
                                              for t in truckTypeIds
                                              if int(outVolumeThroughHubOnTruck[tr, t]) > 0])

inBoundAggregated = pd.DataFrame([_aggregatedReport(r.spoke, r.hub, t,
                                                    sum(int(inVolumeThroughHubOnTruck[tr, t])
                                                        for tr in triples if tr.origin == r.spoke and tr.hub == r.hub))
                                  for r in routes
                                  for t in truckTypeIds
                                  if int(truckOnRoute[r, t]) > 0])

outBoundAggregated = pd.DataFrame([_aggregatedReport(r.spoke, r.hub, t,
                                                     sum(int(outVolumeThroughHubOnTruck[tr, t])
                                                         for tr in triples if tr.destination == r.spoke and tr.hub == r.hub))
                                   for r in routes
                                   for t in truckTypeIds
                                   if int(truckOnRoute[r, t]) > 0])
In [ ]:
from IPython.display import display
In [ ]:
display(nbTrucksOnRouteRes)
In [ ]:
display(inVolumeThroughHubOnTruckRes)
In [ ]:
display(outVolumeThroughHubOnTruckRes)
In [ ]:
display(inBoundAggregated)
In [ ]:
display(outBoundAggregated)
Copyright © 2017-2019 IBM. IPLA licensed Sample Materials.