Portfolio Optimization using Quandl, Bokeh and Gurobi

Borrowed and updated from Michael C. Grant, Continuum Analytics


In [1]:
import pandas as pd
import numpy as np
from math import sqrt
import sys
from bokeh.plotting import figure, show, ColumnDataSource, save
from bokeh.models import Range1d, HoverTool
from bokeh.io import output_notebook, output_file
import quandl
from gurobipy import *
# output_notebook() #To enable Bokeh output in notebook, uncomment this line

First of all, we need some data to proceed. For that purpose we use Quandl. First, you're going to need the quandl package. This isn't totally necessary, as pulling from the API is quite simple with or without the package, but it does make it a bit easier and knocks out a few steps. The Quandl package can be downloaded here. If we set up quandl, next thing to do is to choose some stocks to import. The following is a random selection of stocks.


In [2]:
APIToken = "xxx-xxxxxx"

quandlcodes = ["GOOG/NASDAQ_AAPL.4","WIKI/GOOGL.4", "GOOG/NASDAQ_CSCO.4","GOOG/NASDAQ_FB.4",
"GOOG/NASDAQ_MSFT.4","GOOG/NASDAQ_TSLA.4","GOOG/NASDAQ_YHOO.4","GOOG/PINK_CSGKF.4",
"YAHOO/F_EOAN.4","YAHOO/F_BMW.4","YAHOO/F_ADS.4","GOOG/NYSE_ABB.4","GOOG/VTX_ADEN.4",
"GOOG/VTX_NOVN.4","GOOG/VTX_HOLN.4","GOOG/NYSE_UBS.4", "GOOG/NYSE_SAP.4", "YAHOO/SW_SNBN.4",
"YAHOO/IBM.4", "YAHOO/RIG.4" , "YAHOO/CTXS.4", "YAHOO/INTC.4","YAHOO/KO.4",
"YAHOO/NKE.4","YAHOO/MCD.4","YAHOO/EBAY.4","GOOG/VTX_NESN.4","YAHOO/MI_ALV.4","YAHOO/AXAHF.4",
"GOOG/VTX_SREN.4"]

The command to import those stocks is quandl.get(). With trim_start and trim_end we can choose a desired time horizon.


In [3]:
data = quandl.get(quandlcodes,authtoken=APIToken, trim_start='2009-01-01', trim_end='2016-11-09', paginate=True, per_end_date={'gte': '2009-01-01'},
                  qopts={'columns':['ticker', 'per_end_date']})

Let's now calculate the growth rates and some stats:


In [4]:
GrowthRates = data.pct_change()*100
syms = GrowthRates.columns
Sigma = GrowthRates.cov()
stats = pd.concat((GrowthRates.mean(),GrowthRates.std()),axis=1)
stats.columns = ['Mean_return', 'Volatility']
extremes = pd.concat((stats.idxmin(),stats.min(),stats.idxmax(),stats.max()),axis=1)
extremes.columns = ['Minimizer','Minimum','Maximizer','Maximum']
stats


Out[4]:
Mean_return Volatility
GOOG/NASDAQ_AAPL - Close 0.125833 1.726076
WIKI/GOOGL - Close 0.069063 1.967627
GOOG/NASDAQ_CSCO - Close 0.047938 1.731333
GOOG/NASDAQ_FB - Close 0.136107 2.556192
GOOG/NASDAQ_MSFT - Close 0.069622 1.602043
GOOG/NASDAQ_TSLA - Close 0.184646 3.345699
GOOG/NASDAQ_YHOO - Close 0.081633 2.021945
GOOG/PINK_CSGKF - Close 0.000057 2.690247
YAHOO/F_EOAN - Close -0.057462 1.879740
YAHOO/F_BMW - Close 0.082739 2.040693
YAHOO/F_ADS - Close 0.094261 1.729286
GOOG/NYSE_ABB - Close 0.035269 1.884311
GOOG/VTX_ADEN - Close 0.041308 1.933576
GOOG/VTX_NOVN - Close 0.021955 1.184016
GOOG/VTX_HOLN - Close 0.017115 2.011954
GOOG/NYSE_UBS - Close 0.036881 2.706204
GOOG/NYSE_SAP - Close 0.056749 1.628628
YAHOO/SW_SNBN - Close 0.035416 1.782714
YAHOO/IBM - Close 0.037257 1.290793
YAHOO/RIG - Close -0.041073 2.869053
YAHOO/CTXS - Close 0.088172 2.216246
YAHOO/INTC - Close 0.054862 1.617349
YAHOO/KO - Close 0.010891 1.522319
YAHOO/NKE - Close 0.031609 2.284062
YAHOO/MCD - Close 0.035156 1.033587
YAHOO/EBAY - Close 0.067272 2.389571
GOOG/VTX_NESN - Close 0.031053 1.026613
YAHOO/MI_ALV - Close 0.053178 2.010690
YAHOO/AXAHF - Close 0.027366 2.623103
GOOG/VTX_SREN - Close 0.053448 2.192983

As we move towards our Markowitz portfolio designs it makes sense to view the stocks on a mean/variance scatter plot.


In [5]:
fig = figure(tools="pan,box_zoom,reset,resize")
source = ColumnDataSource(stats)
hover = HoverTool(tooltips=[('Symbol','@index'),('Volatility','@Volatility'),('Mean return','@Mean_return')])
fig.add_tools(hover)
fig.circle('Volatility', 'Mean_return', size=5, color='maroon', source=source)
fig.text('Volatility', 'Mean_return', syms, text_font_size='10px', x_offset=3, y_offset=-2, source=source)
fig.xaxis.axis_label='Volatility (standard deviation)'
fig.yaxis.axis_label='Mean return'
output_file("portfolio.html")
show(fig)

Gurobi

Time to bring in the big guns. Expressed in mathematical terms, we will be solving models in this form:

$$\begin{array}{lll} \text{minimize} & x^T \Sigma x \\ \text{subject to} & \sum_i x_i = 1 & \text{fixed budget} \\ & r^T x = \gamma & \text{fixed return} \\ & x \geq 0 \end{array}$$

In this model, the optimization variable $x\in\mathbb{R}^N$ is a vector representing the fraction of the budget allocated to each stock; that is, $x_i$ is the amount allocated to stock $i$. The paramters of the model are the mean returns $r$, a covariance matrix $\Sigma$, and the target return $\gamma$. What we will do is sweep $\gamma$ between the worst and best returns we have seen above, and compute the portfolio that achieves the target return but with as little risk as possible.

The covariance matrix $\Sigma$ merits some explanation. Along the diagonal, it contains the squares of the volatilities (standard deviations) computed above. But off the diagonal, it contains measures of the correlation between two stocks: that is, whether they tend to move in the same direction (positive correlation), in opposite directions (negative correlation), or a mixture of both (small correlation). This entire matrix is computed with a single call to Pandas.

Building the base model

We are not solving just one model here, but literally hundreds of them, with different return targets and with the short constraints added or removed. One very nice feature of the Gurobi Python interface is that we can build a single "base" model, and reuse it for each of these scenarios by adding and removing constraints.

First, let's initialize the model and declare the variables. As we mentioned above, we're creating separate variables for the long and short positions. We put these variables into a Pandas DataFrame for easy organization, and create a third column that holds the difference between the long and short variables---that is, the net allocations for each stock. Another nice feature of Gurobi's Python interface is that the variable objects can be used in simple linear and quadratic expressions using familar Python syntax.


In [6]:
# Instantiate our model
m = Model("portfolio")

# Create one variable for each stock
portvars = [m.addVar(name=symb,lb=0.0) for symb in syms]
portvars[7]=m.addVar(name='GOOG/PINK_CSGKF - Close',lb=0.0,ub=0.5)
portvars = pd.Series(portvars, index=syms)
portfolio = pd.DataFrame({'Variables':portvars})

# Commit the changes to the model
m.update()

# The total budget
p_total = portvars.sum()

# The mean return for the portfolio
p_return = stats['Mean_return'].dot(portvars)

# The (squared) volatility of the portfolio
p_risk = Sigma.dot(portvars).dot(portvars)

# Set the objective: minimize risk
m.setObjective(p_risk, GRB.MINIMIZE)

# Fix the budget
m.addConstr(p_total, GRB.EQUAL, 1)

# Select a simplex algorithm (to ensure a vertex solution)
m.setParam('Method', 1)

m.optimize()


Changed value of parameter Method to 1
   Prev: -1  Min: -1  Max: 4  Default: -1
Optimize a model with 1 rows, 31 columns and 30 nonzeros
Model has 465 quadratic objective terms
Coefficient statistics:
  Matrix range     [1e+00, 1e+00]
  Objective range  [0e+00, 0e+00]
  QObjective range [5e-04, 2e+01]
  Bounds range     [5e-01, 5e-01]
  RHS range        [1e+00, 1e+00]
Presolve removed 0 rows and 1 columns
Presolve time: 0.01s
Presolved: 1 rows, 30 columns, 30 nonzeros
Presolved model has 465 quadratic objective terms

Iteration    Objective       Primal Inf.    Dual Inf.      Time
       0    0.0000000e+00   0.000000e+00   0.000000e+00      0s
      12    4.9948178e-01   0.000000e+00   0.000000e+00      0s

Solved in 12 iterations and 0.02 seconds
Optimal objective  4.994817828e-01

Minimum Risk Model

We have set our objective to minimize risk, and fixed our budget at 1. The model we solved above gave us the minimum risk model.


In [7]:
portfolio['Minimum risk'] = portvars.apply(lambda x:x.getAttr('x'))
portfolio


Out[7]:
Variables Minimum risk
GOOG/NASDAQ_AAPL - Close <gurobi.Var GOOG/NASDAQ_AAPL - Close (value 0.... 0.019100
WIKI/GOOGL - Close <gurobi.Var WIKI/GOOGL - Close (value 0.003253... 0.003253
GOOG/NASDAQ_CSCO - Close <gurobi.Var GOOG/NASDAQ_CSCO - Close (value 0.0)> 0.000000
GOOG/NASDAQ_FB - Close <gurobi.Var GOOG/NASDAQ_FB - Close (value 0.02... 0.024476
GOOG/NASDAQ_MSFT - Close <gurobi.Var GOOG/NASDAQ_MSFT - Close (value 0.0)> 0.000000
GOOG/NASDAQ_TSLA - Close <gurobi.Var GOOG/NASDAQ_TSLA - Close (value 0.0)> 0.000000
GOOG/NASDAQ_YHOO - Close <gurobi.Var GOOG/NASDAQ_YHOO - Close (value 0.0)> 0.000000
GOOG/PINK_CSGKF - Close <gurobi.Var GOOG/PINK_CSGKF - Close (value 0.0)> 0.000000
YAHOO/F_EOAN - Close <gurobi.Var YAHOO/F_EOAN - Close (value 0.0)> 0.000000
YAHOO/F_BMW - Close <gurobi.Var YAHOO/F_BMW - Close (value 0.0)> 0.000000
YAHOO/F_ADS - Close <gurobi.Var YAHOO/F_ADS - Close (value 0.00280... 0.002803
GOOG/NYSE_ABB - Close <gurobi.Var GOOG/NYSE_ABB - Close (value 0.0)> 0.000000
GOOG/VTX_ADEN - Close <gurobi.Var GOOG/VTX_ADEN - Close (value 0.0)> 0.000000
GOOG/VTX_NOVN - Close <gurobi.Var GOOG/VTX_NOVN - Close (value 0.124... 0.124355
GOOG/VTX_HOLN - Close <gurobi.Var GOOG/VTX_HOLN - Close (value 0.0)> 0.000000
GOOG/NYSE_UBS - Close <gurobi.Var GOOG/NYSE_UBS - Close (value 0.0)> 0.000000
GOOG/NYSE_SAP - Close <gurobi.Var GOOG/NYSE_SAP - Close (value 0.0)> 0.000000
YAHOO/SW_SNBN - Close <gurobi.Var YAHOO/SW_SNBN - Close (value 0.146... 0.146450
YAHOO/IBM - Close <gurobi.Var YAHOO/IBM - Close (value 0.0895428... 0.089543
YAHOO/RIG - Close <gurobi.Var YAHOO/RIG - Close (value 0.0)> 0.000000
YAHOO/CTXS - Close <gurobi.Var YAHOO/CTXS - Close (value 0.0)> 0.000000
YAHOO/INTC - Close <gurobi.Var YAHOO/INTC - Close (value 0.0)> 0.000000
YAHOO/KO - Close <gurobi.Var YAHOO/KO - Close (value 0.08424357... 0.084244
YAHOO/NKE - Close <gurobi.Var YAHOO/NKE - Close (value 0.0)> 0.000000
YAHOO/MCD - Close <gurobi.Var YAHOO/MCD - Close (value 0.2588231... 0.258823
YAHOO/EBAY - Close <gurobi.Var YAHOO/EBAY - Close (value 0.0)> 0.000000
GOOG/VTX_NESN - Close <gurobi.Var GOOG/VTX_NESN - Close (value 0.246... 0.246953
YAHOO/MI_ALV - Close <gurobi.Var YAHOO/MI_ALV - Close (value 0.0)> 0.000000
YAHOO/AXAHF - Close <gurobi.Var YAHOO/AXAHF - Close (value 0.0)> 0.000000
GOOG/VTX_SREN - Close <gurobi.Var GOOG/VTX_SREN - Close (value 0.0)> 0.000000

In [8]:
# Add the return target
ret50 = 0.5 * extremes.loc['Mean_return','Maximum']
fixreturn = m.addConstr(p_return, GRB.EQUAL, ret50)

m.optimize()

portfolio['50% Max'] = portvars.apply(lambda x:x.getAttr('x'))


Optimize a model with 2 rows, 31 columns and 60 nonzeros
Model has 465 quadratic objective terms
Coefficient statistics:
  Matrix range     [6e-05, 1e+00]
  Objective range  [0e+00, 0e+00]
  QObjective range [5e-04, 2e+01]
  Bounds range     [5e-01, 5e-01]
  RHS range        [9e-02, 1e+00]
Iteration    Objective       Primal Inf.    Dual Inf.      Time
       0    0.0000000e+00   0.000000e+00   0.000000e+00      0s
      11    9.5121296e-01   0.000000e+00   0.000000e+00      0s

Solved in 11 iterations and 0.01 seconds
Optimal objective  9.512129585e-01

The efficient frontier

Now what we're going to do is sweep our return target over a range of values, starting at the smallest possible value to the largest. For each, we construct the minimum-risk portfolio. This will give us a tradeoff curve that is known in the business as the efficient frontier or the Pareto-optimal curve.

Note that we're using the same model object we've already constructed! All we have to do is set the return target and re-optimize for each value of interest.


In [9]:
m.setParam('OutputFlag',False)

# Determine the range of returns. Make sure to include the lowest-risk
# portfolio in the list of options
minret = extremes.loc['Mean_return','Minimum']
maxret = extremes.loc['Mean_return','Maximum']
riskret = extremes.loc['Volatility','Minimizer']
riskret = stats.loc[riskret,'Mean_return']
riskret =sum(portfolio['Minimum risk']*stats['Mean_return'])
returns = np.unique(np.hstack((np.linspace(minret,maxret,10000),riskret)))


# Iterate through all returns
risks = returns.copy()
for k in range(len(returns)):
    fixreturn.rhs = returns[k]
    m.optimize()
    risks[k] = sqrt(p_risk.getValue())

In [10]:
fig = figure(tools="pan,box_zoom,reset,resize")

# Individual stocks
fig.circle(stats['Volatility'], stats['Mean_return'], size=5, color='maroon')
fig.text(stats['Volatility'], stats['Mean_return'], syms, text_font_size='10px', x_offset=3, y_offset=-2)
fig.circle('Volatility', 'Mean_return', size=5, color='maroon', source=source)
# Divide the efficient frontier into two sections: those with
# a return less than the minimum risk portfolio, those that are greater.
tpos_n = returns >= riskret
tneg_n = returns <= riskret
fig.line(risks[tneg_n], returns[tneg_n], color='red')
fig.line(risks[tpos_n], returns[tpos_n], color='blue')

fig.xaxis.axis_label='Volatility (standard deviation)'
fig.yaxis.axis_label='Mean return'
fig.legend.orientation='bottom_left'
output_file("efffront.html")
show(fig)


INFO:bokeh.core.state:Session output file 'efffront.html' already exists, will be overwritten.