Use decision optimization to balance cloud server load.

This tutorial includes everything you need to set up decision optimization engines, build mathematical programming models, and solve a capacitated facility location problem that balances load across servers.

When you finish this tutorial, you'll have a foundational knowledge of Prescriptive Analytics.

This notebook is part of Prescriptive Analytics for Python.

It requires either a local installation of CPLEX Optimizers or access to IBM Watson Studio Cloud (sign up for a free IBM Cloud account and you can start using Watson Studio Cloud right away).

Table of contents:

The business problem
How decision optimization (prescriptive analytics) can help
Use decision optimization
Summary

The business problem: Capacitated Facility Location.

The description of the problem can be found here: http://blog.yhat.com/posts/how-yhat-does-cloud-balancing.html

How decision optimization can help

Prescriptive analytics (decision optimization) technology recommends actions that are based on desired outcomes. It takes into account specific scenarios, resources, and knowledge of past and current events. With this insight, your organization can make better decisions and have greater control of business outcomes.
Prescriptive analytics is the next step on the path to insight-based actions. It creates value through synergy with predictive analytics, which analyzes data to predict future outcomes.
Prescriptive analytics takes that insight to the next level by suggesting the optimal way to handle that future situation. Organizations that can act fast in dynamic conditions and make superior decisions in uncertain environments gain a strong competitive advantage.

With prescriptive analytics, you can:

Automate the complex decisions and trade-offs to better manage your limited resources.
Take advantage of a future opportunity or mitigate a future risk.
Proactively update recommendations based on changing events.
Meet operational goals, increase customer loyalty, prevent threats and fraud, and optimize business processes.

Use decision optimization

Step 1: Import the library

Run the following code to import the Decision Optimization CPLEX Modeling library. The DOcplex library contains two modeling packages: Mathematical Programming (docplex.mp) and Constraint Programming (docplex.cp). This tutorial uses docplex.mp.



In [ ]:

    
# Fail early with an actionable message if docplex is not installed.
try:
    import docplex.mp
except ImportError:
    raise ImportError('Please install docplex. See https://pypi.org/project/docplex/')
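
If you prefer to test for the package without importing it, the standard library can look the module up first. This is a minimal sketch; the helper name is_installed is hypothetical and is not part of DOcplex.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Reports whether docplex is available in the current environment.
print("docplex installed:", is_installed("docplex"))
```

The check only works reliably for top-level names such as "docplex"; looking up a submodule would import its parent package first.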

Step 2: Model the data

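In this scenario, the data is simple and is delivered as a CSV file hosted on GitHub. Each row gives a user ID, the number of running processes, the number of sleeping processes, and the server that currently hosts the user.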



In [ ]:

    
from collections import namedtuple



In [ ]:

    
class TUser(namedtuple("TUser", ["id", "running", "sleeping", "current_server"])):
    def __str__(self):
        return self.id



In [ ]:

    
try:
    from StringIO import StringIO
except ImportError:
    from io import StringIO



In [ ]:

    
try:
    from urllib2 import urlopen
except ImportError:
    from urllib.request import urlopen



In [ ]:

    
import csv

data_url = "https://github.com/vberaudi/utwt/blob/master/users.csv?raw=true"
xld = urlopen(data_url).read()
xlds = StringIO(xld.decode('utf-8'))
reader = csv.reader(xlds)

users = [(row[0], int(row[1]), int(row[2]), row[3]) for row in reader]
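
To try the parsing step without network access, the same logic can be run on an in-memory string. This is a sketch under assumptions: the sample rows below are made up for illustration and are not taken from users.csv, and parse_users is a hypothetical helper name.

```python
import csv
from io import StringIO

# Hypothetical sample rows in the column order the notebook relies on:
# user id, running processes, sleeping processes, current server.
SAMPLE_CSV = """user001,2,1,server002
user002,0,3,server001
user003,5,0,server002
"""


def parse_users(text):
    """Parse CSV text into (id, running, sleeping, current_server) tuples."""
    reader = csv.reader(StringIO(text))
    # Skip empty rows so a trailing blank line does not raise an IndexError.
    return [(row[0], int(row[1]), int(row[2]), row[3]) for row in reader if row]


sample_users = parse_users(SAMPLE_CSV)
print(sample_users)
```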

Step 3: Prepare the data

Set the maximum number of running processes that a single server can host, convert each data row into a TUser record, and collect the set of servers that currently host at least one user.



In [ ]:

    
max_processes_per_server = 50

users = [TUser(*user_row) for user_row in users]


servers = list({t.current_server for t in users})
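
The preparation step can be illustrated on a few made-up rows. This sketch assumes hypothetical sample data. It sorts the server names because a set has no guaranteed iteration order, and it derives a simple lower bound on the number of servers from the capacity limit.

```python
import math
from collections import namedtuple

TUser = namedtuple("TUser", ["id", "running", "sleeping", "current_server"])

# Hypothetical rows, for illustration only.
rows = [
    ("user001", 2, 1, "server002"),
    ("user002", 0, 3, "server001"),
    ("user003", 5, 0, "server002"),
]
capacity = 50  # same value as max_processes_per_server in the notebook

sample_users = [TUser(*row) for row in rows]

# Sorting makes the server order reproducible from run to run.
sample_servers = sorted({u.current_server for u in sample_users})

# Each server hosts at most `capacity` running processes, so at least
# ceil(total running / capacity) servers are needed.
total_running = sum(u.running for u in sample_users)
min_servers = math.ceil(total_running / capacity)

print(sample_servers, total_running, min_servers)
```

The bound only considers running processes, so the optimal number of active servers may be higher.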

Step 4: Set up the prescriptive model



In [ ]:

    
from docplex.mp.environment import Environment
env = Environment()
env.print_information()

Create the DOcplex model

The model contains all the business constraints and defines the objective.



In [ ]:

    
from docplex.mp.model import Model

mdl = Model("load_balancing")
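
For orientation, the model that the following cells build can be written compactly. The notation is an assumption introduced here, not the notebook's: U is the set of users, S the set of servers, r_u and p_u the running and sleeping process counts of user u, c_u the user's current server, and C the value of max_processes_per_server. The variables y_s, x_{u,s} and z correspond to active_var_by_server, assign_user_to_server_vars and max_sleeping_workload.

```latex
\begin{align*}
\text{lex-min}\quad & \Big(\sum_{s \in S} y_s,\;\; \sum_{u \in U}\ \sum_{s \in S,\, s \neq c_u} x_{u,s},\;\; z\Big) \\
\text{s.t.}\quad & \sum_{u \in U} r_u\, x_{u,s} \le C && \forall s \in S \\
& x_{u,s} \le y_s && \forall u \in U,\ s \in S \\
& \sum_{s \in S} x_{u,s} = 1 && \forall u \in U \\
& \sum_{u \in U} p_u\, x_{u,s} \le z && \forall s \in S \\
& x_{u,s},\ y_s \in \{0, 1\}, \quad z \in \mathbb{Z}_{\ge 0}
\end{align*}
```

The three objective terms are optimized in order of priority: active servers first, then migrations, then the maximum sleeping workload.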

Define the decision variables



In [ ]:

    
active_var_by_server = mdl.binary_var_dict(servers, name='isActive')

def user_server_pair_namer(u_s):
    u, s = u_s
    return '%s_to_%s' % (u.id, s)

assign_user_to_server_vars = mdl.binary_var_matrix(users, servers, user_server_pair_namer)

max_sleeping_workload = mdl.integer_var(name="max_sleeping_processes")



In [ ]:

    
def _is_migration(user, server):
    """ Returns True if server is not the user's current server.
        Used to count migrations.
    """
    return server != user.current_server

Express the business constraints



In [ ]:

    
mdl.add_constraints(
    mdl.sum(assign_user_to_server_vars[u, s] * u.running for u in users) <= max_processes_per_server
    for s in servers)
mdl.print_information()



In [ ]:

    
# each assignment var <u, s>  is <= active_server(s)
for s in servers:
    for u in users:
        ct_name = 'ct_assign_to_active_{0!s}_{1!s}'.format(u, s)
        mdl.add_constraint(assign_user_to_server_vars[u, s] <= active_var_by_server[s], ct_name)



In [ ]:

    
# sum of assignment vars for (u, all s in servers) == 1
for u in users:
    ct_name = 'ct_unique_server_%s' % u.id
    mdl.add_constraint(mdl.sum((assign_user_to_server_vars[u, s] for s in servers)) == 1.0, ct_name)
mdl.print_information()
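
The constraints above can be sanity-checked on a candidate assignment without a solver. The following is a plain-Python sketch, not part of DOcplex; check_assignment and the sample users are hypothetical. Representing the assignment as a mapping from user ID to one server enforces the "exactly one server per user" rule by construction.

```python
from collections import namedtuple

TUser = namedtuple("TUser", ["id", "running", "sleeping", "current_server"])


def check_assignment(users, assignment, capacity):
    """Return a list of violation messages for a {user id: server} mapping."""
    problems = []
    running_load = {}
    for u in users:
        server = assignment.get(u.id)
        if server is None:
            # Every user must be assigned to a server.
            problems.append("user %s has no server" % u.id)
            continue
        running_load[server] = running_load.get(server, 0) + u.running
    for server, load in sorted(running_load.items()):
        # Running processes on a server must not exceed its capacity.
        if load > capacity:
            problems.append("server %s runs %d > %d" % (server, load, capacity))
    return problems


sample_users = [
    TUser("user001", 2, 1, "server002"),
    TUser("user002", 0, 3, "server001"),
    TUser("user003", 5, 0, "server002"),
]
everyone_on_one = {u.id: "server002" for u in sample_users}

print(check_assignment(sample_users, everyone_on_one, capacity=50))
print(check_assignment(sample_users, everyone_on_one, capacity=6))
```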



In [ ]:

    
number_of_active_servers = mdl.sum((active_var_by_server[svr] for svr in servers))
mdl.add_kpi(number_of_active_servers, "Number of active servers")

number_of_migrations = mdl.sum(
    assign_user_to_server_vars[u, s] for u in users for s in servers if _is_migration(u, s))
mdl.add_kpi(number_of_migrations, "Total number of migrations")


for s in servers:
    ct_name = 'ct_define_max_sleeping_%s' % s
    mdl.add_constraint(
        mdl.sum(
            assign_user_to_server_vars[u, s] * u.sleeping for u in users) <= max_sleeping_workload,
        ct_name)
mdl.add_kpi(max_sleeping_workload, "Max sleeping workload")
mdl.print_information()

Express the objective



In [ ]:

    
# Set objective function
mdl.minimize(number_of_active_servers)

mdl.print_information()

Solve with Decision Optimization

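The goals are solved in lexicographic order: the solver first minimizes the number of active servers, then the total number of migrations, and finally the maximum sleeping workload. Each later goal is optimized without degrading the goals that come before it.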



In [ ]:

    
# build an ordered sequence of goals
ordered_kpi_keywords = ["servers", "migrations", "sleeping"]
ordered_goals = [mdl.kpi_by_name(k) for k in ordered_kpi_keywords]

mdl.solve_lexicographic(ordered_goals)
mdl.report()
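
To see what lexicographic ordering means, a tiny instance can be solved by brute force in plain Python. This is an illustrative sketch only: it enumerates every assignment, which is feasible for a handful of users but not for real data, and it does not show how CPLEX solves the model. The users, servers, and capacity below are hypothetical.

```python
from collections import namedtuple
from itertools import product

TUser = namedtuple("TUser", ["id", "running", "sleeping", "current_server"])


def lexicographic_best(users, servers, capacity):
    """Enumerate all assignments and keep the lexicographically smallest goal tuple."""
    best_key, best_assignment = None, None
    for choice in product(servers, repeat=len(users)):
        running = {s: 0 for s in servers}
        sleeping = {s: 0 for s in servers}
        for u, s in zip(users, choice):
            running[s] += u.running
            sleeping[s] += u.sleeping
        # Skip assignments that overload a server.
        if any(load > capacity for load in running.values()):
            continue
        key = (
            len(set(choice)),  # goal 1: number of active servers
            sum(1 for u, s in zip(users, choice) if s != u.current_server),  # goal 2: migrations
            max(sleeping.values()),  # goal 3: maximum sleeping workload
        )
        # Tuple comparison in Python is lexicographic, matching the goal order.
        if best_key is None or key < best_key:
            best_key = key
            best_assignment = {u.id: s for u, s in zip(users, choice)}
    return best_key, best_assignment


tiny_users = [TUser("a", 3, 1, "s1"), TUser("b", 2, 4, "s2"), TUser("c", 1, 2, "s2")]
best_key, best_assignment = lexicographic_best(tiny_users, ["s1", "s2"], capacity=4)
print(best_key, best_assignment)
```

In this instance two servers are unavoidable, and the result keeps every user in place even though moving user c would lower the maximum sleeping workload from 6 to 4. Migrations rank above sleeping balance in the goal order, so the zero-migration assignment wins.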

Step 5: Investigate the solution and then run an example analysis



In [ ]:

    
active_servers = sorted([s for s in servers if active_var_by_server[s].solution_value == 1])


print("Active Servers: {}".format(active_servers))

print("*** User assignment ***")

for (u, s) in sorted(assign_user_to_server_vars):
    if assign_user_to_server_vars[(u, s)].solution_value == 1:
        print("{} uses {}, migration: {}".format(u, s, "yes" if _is_migration(u, s) else "no"))
print("*** Servers sleeping processes ***")
for s in active_servers:
    sleeping = sum(assign_user_to_server_vars[u, s].solution_value * u.sleeping for u in users)
    print("Server: {} #sleeping={}".format(s, sleeping))
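
Solvers return binary variables as floating-point numbers that satisfy integrality only within a tolerance, so a value such as 0.9999999998 is possible and an exact comparison with 1 would miss it. The sketch below shows a tolerant check; is_selected, the tolerance, and the sample values are hypothetical and do not come from an actual solve.

```python
def is_selected(value, tol=1e-6):
    """Treat a solver value as 1 when it lies within tol of 1.0."""
    return abs(value - 1.0) <= tol


# Hypothetical solution values keyed by (user id, server).
raw_values = {
    ("user001", "server001"): 1e-10,
    ("user001", "server002"): 0.9999999998,
    ("user002", "server001"): 1.0,
    ("user002", "server002"): 0.0,
}

chosen = sorted(pair for pair, value in raw_values.items() if is_selected(value))
print(chosen)
```

An equivalent option is to round each value before comparing it.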

Summary

You learned how to set up and use IBM Decision Optimization CPLEX Modeling for Python to formulate a Mathematical Programming model and solve it with IBM Decision Optimization.



