JuMPeR Tutorial

MIT Robust Optimization Class, Feb 20th, 2015

Today I'm going to be teaching you about JuMPeR, an extension for JuMP for robust optimization. I'm somewhat assuming you are familar with Julia/JuMP, but we'll review JuMP in the process of learning JuMPeR.

Motivational Problem: Portfolio Optimization

We're going to be looking at a fairly simple problem today, portfolio optimization. The general goal is to select a portfolio of assets to maximize return, but we also take into account the "risk" of the portfolio. In general, the higher the return, the higher the risk. Our goal is to find portfolios that trade off return and risk in such a way as to make sure that, for a given risk appetite/tolerance, we find the highest return.

Review of JuMP: Deterministic Portfolio Optimization

To get warmed up, we'll model the classic Markowitz portfolio optmization problem. Given $n$ assets, with average returns $\mathbf{r}$ and covariance matrix of returns $\mathbf{\Sigma}$, we can express the problem of finding the minimum risk portfolio for a given return as $$ \begin{alignat}{2} \min_{\mathbf{x}\geq 0} \quad & \mathbf{x}^T \mathbf{\Sigma} \mathbf{x} \\ \text{subject to} \quad & \mathbf{e}^T \cdot \mathbf{x} = 1 \\ & \mathbf{r}^T \cdot \mathbf{x} \geq R \end{alignat} $$ You can imagine a flipped problem where we maximize return subject to a risk tolerance. We also note that we've defined risk to be the variance of the resulting portfolio - this isn't probably a good idea in practice, but this is an older-style model.



In [ ]:

    
# We'll define some helper functions to generate data
# and to simulate the performance of our portfolios.
# Make sure you have Distributions installed first
#Pkg.add("Distributions")
using Distributions
function generate_data(n, num_obs)
    data = zeros(num_obs, n)

    # Linking factors
    beta = [(i-1.0)/n for i = 1:n] 

    for obs_ind = 1:num_obs
        # Common market factor, mean 3%, sd 5%, truncate at +- 3 sd
        z = rand(Normal(0.03, 0.05))
        z = max(z, 0.03 - 3*0.05)
        z = min(z, 0.03 + 3*0.05)

        for asset_ind = 1:n
            # Idiosyncratic contribution, mean 0%, sd 5%, truncated at +- 3 sd
            asset = rand(Normal(0.00, 0.05))
            asset = max(asset, 0.00 - 3*0.05)
            asset = min(asset, 0.00 + 3*0.05)
            data[obs_ind, asset_ind] = beta[asset_ind] * z + asset
        end
    end
    return 100*data
end

# Alternative data generation function
#function generate_data(n, num_obs)
#    # Single factor CAPM model
#    z = .1 * randn(num_obs) + .2
#    betas = linspace(0, 1, n)
#    z * betas' + .05*randn(num_obs, n)
#end

srand(2000)
returns = generate_data(5, 10000)
println("Mean returns:")
println( round( mean(returns,1), 2) )
println("Covariance of returns:")
println( round( cov(returns),    3) )

function eval_portfolio(returns, x)
    port_returns = returns * x
    # Collect summary stats for the returns
    sort!(port_returns)
    m = length(port_returns)
    @printf(" Minimum: %6.3f\n", port_returns[1])
    @printf("25th per: %6.3f\n", port_returns[div(m,4)])
    @printf("50th per: %6.3f\n", port_returns[div(m,2)])
    @printf("75th per: %6.3f\n", port_returns[div(3m,4)])
    @printf(" Maximum: %6.3f\n", port_returns[end])
    @printf("    Mean: %6.3f\n", mean(port_returns))
    @printf("Std. Dev: %6.3f\n", std(port_returns))
    @printf("Variance: %7.4f\n", var(port_returns))
end

println()
println("Stats for a portfolio that invests in all equally")
eval_portfolio(returns, [0.2, 0.2, 0.2, 0.2, 0.2])
println()
println("Maximum return portfolio")
eval_portfolio(returns, [0.0, 0.0, 0.0, 0.0, 1.0])



In [ ]:

    
# We can finally look at the deterministic optimization problem
using JuMP, Gurobi

function solve_det_port(returns, min_return)
    n = size(returns,2)
    # Create a JuMP Model, using Gurobi as the solver
    detport = Model(solver=GurobiSolver())
    @defVar(detport, 0 <= x[1:n] <= 1)
    # Must be a valid allocation
    @addConstraint(detport, sum{x[i], i=1:n} == 1)
    # Must make min return
    r = mean(returns,1)
    @addConstraint(detport, sum{r[i]*x[i], i=1:n} >= min_return)
    # Objective: minimize risk (variance)
    Σ = cov(returns)
    @setObjective(detport, Min, sum{Σ[i,j]*x[i]*x[j], i=1:n, j=1:n})
    # Solve it!
    solve(detport)
    # Return the portfolio
    getValue(x)[:]
end

# Try it out
srand(1988)
train_returns = generate_data(10, 10000)
# Must make at least...
detsol = solve_det_port(train_returns, 1.216)
println("Portfolio")
println(round(detsol,2))
println("Portfolio on training data")
eval_portfolio(train_returns, detsol)

Learning JuMPeR: Robust Portfolio Optimization

In a robust portfolio optmization problem, we still have our $n$ assets and our goal is to balance risk and return. Instead of using the mean and variance of the portfolio, we'll instead try to maximize the worst-case return over our uncertainty set, i.e. $$ \begin{alignat}{2} \min_{\mathbf{x}\geq 0} \quad & \min_{r \in U} \mathbf{r}^T \cdot \mathbf{x} \\ \text{subject to} \quad & \mathbf{e}^T \cdot \mathbf{x} = 1 \end{alignat} $$ The tradeoff between risk and return is baked into the uncertainty set $U$.

Uncertainty Set 1: Box

We'll start with a somewhat silly uncertainty set, a box. Given past returns, calculate average return $\bar{\mathbf{r}}$ and the standard deviation of return $\bar{\mathbf{\sigma}}$ for each asset. Then our uncertainty set is $$ \begin{alignat}{2} U & = \left\{ \bar{r}_i - \alpha*\sigma_i \leq r \leq \bar{r}_i + \alpha*\sigma_i \right\} \end{alignat} $$ where $\alpha$ controls the size of the box - $\alpha=0$ is just the average return.



In [ ]:

    
# Need to use JuMP as well as JuMPeR
using JuMP, JuMPeR

function solve_box_port(returns, α)
    n = size(returns,2)
    # Create a JuMPeR RobustModel - mostly like a JuMP Model
    # in terms of construction
    boxport = RobustModel(solver=GurobiSolver())
    # Create our variables
    @defVar(boxport, 0 <= x[1:n] <= 1)
    # Must be a valid allocation
    @addConstraint(boxport, sum{x[i], i=1:n} == 1)

    # Uncertainty
    # First lets calculate the data we need
    avg_r = mean(returns,1)
    σ     = std(returns,1)
    # Create our Uncertains
    # This is pretty much the same as constructing a variable!
    @defUnc(boxport, avg_r[i] - α*σ[i] <= r[i=1:n] <= avg_r[i] + α*σ[i])
    # We don't need any additional constraints right now on our
    # uncertainty set than those ranges.
    
    # Objective
    # JuMPeR does not allow for uncertain objectives - they must
    # be written as constraints.
    @defVar(boxport, obj)
    @setObjective(boxport, Max, obj)
    # Uncertains can be mixed with variables just like data
    @addConstraint(boxport, obj <= sum{r[i]*x[i], i=1:n})
    
    # Print it to have a look
    print(boxport)
    
    # Solve it!
    solve(boxport)
    #solve(boxport, debug_printfinal=true)
    
    # Return the portfolio
    getValue(x)[:]
end

# Try it out
srand(1988)
train_returns = generate_data(10, 10000)
boxsol = solve_box_port(train_returns, 1)
# Look at how we now have 12 rows, 31 columns
println("Portfolio")
println(round(boxsol,2))
println("Portfolio on training data")
eval_portfolio(train_returns, boxsol)

Uncertainty Set 2: Polyhedral

Lets do one step better than a box: a polyhedron. Polyhedral uncertainty sets are pretty popular in the literature because they don't turn (MI)LPs into (MI)SOCPs, even though an ellipsoidal set would seem to be more appropriate. The set we will do is inspired by a definition of a multivariate normal distribution. Given average returns $\bar{\mathbf{r}}$, a covariance matrix of returns $\Sigma$, we can say (assuming returns are normal) $$ \begin{alignat}{2} \mathbf{r} & = \Sigma^\frac{1}{2} \mathbf{z} + \bar{\mathbf{r}} \end{alignat} $$ We'll use that to inspire the uncertainty set: $$ \begin{alignat}{2} U & = \left\{ \mathbf{r} = \Sigma^{0.5} \mathbf{z} + \bar{\mathbf{r}}, \quad \| \mathbf{z} \|_1 \leq \Gamma \right\} \end{alignat} $$ where $\Gamma$ controls the conservativeness.



In [ ]:

    
function solve_poly_port(returns, Γ)
    n = size(returns,2)

    polyport = RobustModel(solver=GurobiSolver())
    @defVar(polyport, 0 <= x[1:n] <= 1)
    @addConstraint(polyport, sum{x[i], i=1:n} == 1)

    # Uncertainty
    # Calculate the data we need
    avg_r = mean(returns,1)
    Σ     = cov(returns)
    Σhalf = chol(Σ)
    
    # Create our Uncertains
    # The returns
    @defUnc(polyport, r[i=1:n])
    # The factors
    @defUnc(polyport, z[i=1:n])
    # The absolute value of the factors
    @defUnc(polyport, zabs[i=1:n] >= 0)
    # Link all these together
    # Constraints that are JUST on Uncertains are added
    # to the uncertainty set automatically, not to the problem
    for i in 1:n
        @addConstraint(polyport, r[i] == sum{Σhalf[i,j]*z[j], j=1:n} + avg_r[i])
        @addConstraint(polyport, zabs[i] >=  z[i])
        @addConstraint(polyport, zabs[i] >= -z[i])
    end
    @addConstraint(polyport, sum{zabs[i], i=1:n} <= Γ)
    
    # Objective - same as before
    @defVar(polyport, obj <= 10)
    @setObjective(polyport, Max, obj)
    @addConstraint(polyport, obj <= sum{r[i]*x[i], i=1:n})
    
    # Print it to have a look
    #print(polyport)
    
    # Solve it!
    solve(polyport)
    #solve(polyport, prefer_cuts=true)
    #solve(boxport, debug_printfinal=true)
    
    # Return the portfolio
    getValue(x)[:]
end

# Try it out
srand(1988)
train_returns = generate_data(10, 10000)
polysol = solve_poly_port(train_returns, 2)
println("Polyhedral Portfolio")
println(round(polysol,2))
println("Polyhedral Portfolio on training data")
eval_portfolio(train_returns, polysol)

Uncertainty Set 3: Ellipsoidal

We can do the same uncertainty set as before, with the norm changed to a 2-norm, i.e. the set is an ellipse $$ \begin{alignat}{2} U & = \left\{ \mathbf{r} = \Sigma^{0.5} \mathbf{z} + \bar{\mathbf{r}}, \quad \| \mathbf{z} \|_2 \leq \Gamma \right\} \end{alignat} $$ where $\Gamma$ controls the conservativeness again.



In [ ]:

    
function solve_ell_port(returns, Γ)
    n = size(returns,2)

    ellport = RobustModel(solver=GurobiSolver())
    @defVar(ellport, 0 <= x[1:n] <= 1)
    @addConstraint(ellport, sum{x[i], i=1:n} == 1)

    # Uncertainty
    # Calculate the data we need
    avg_r = mean(returns,1)
    Σ     = cov(returns)
    Σhalf = chol(Σ)
    
    # Create our Uncertains
    # The returns
    @defUnc(ellport, r[i=1:n])
    # The factors
    @defUnc(ellport, z[i=1:n])
    # Link all these together
    # Constraints that are JUST on Uncertains are added
    # to the uncertainty set automatically, not to the problem
    for i in 1:n
        @addConstraint(ellport, r[i] == sum{Σhalf[i,j]*z[j], j=1:n} + avg_r[i])
    end
    # Add the norm/ellipse constraint
    addEllipseConstraint(ellport, [z[i] for i in 1:n], Γ)
    
    # Objective - same as before
    @defVar(ellport, obj)
    @setObjective(ellport, Max, obj)
    @addConstraint(ellport, obj <= sum{r[i]*x[i], i=1:n})
    
    # Print it to have a look
    #print(ellport)
    
    # Solve it!
    solve(ellport)
    #solve(ellport, debug_printfinal=true)
    #solve(ellport, prefer_cuts=true)
    
    # Return the portfolio
    getValue(x)[:]
end

# Try it out
srand(1988)
train_returns = generate_data(10, 10000)
ellsol = solve_ell_port(train_returns, 10)
println("Ellipsoidal Portfolio")
println(round(ellsol,2))
println("Ellipsoidal Portfolio on training data")
eval_portfolio(train_returns, ellsol)

Advanced JuMPeR: Oracles

JuMPeR's design is focussed around oracles. Oracles are responsible for taking RO problems, or parts of them, and transforming them into something solveable. That could be reformulating uncertain constraints into deterministic constraints, it could be generating cutting-planes, or something else entirely. Oracles are also intimately connected to uncertainty sets - for example, we can provide data to an oracle from which it can generate cutting-planes - an uncertainty set never needs to explictly constructed. Finally oracles are interchangeable - you can obtain oracles from others or create your own to allow others to explore the performance of different sets and implementations.

JuMPeR currently comes with three oracles:

GeneralOracle, the default oracle. It takes an explicit polyhedral/ellipsoidal representation of the uncertainty set and can generate a reformulation or use cutting-planes.
GeneralGraphOracle, a variation on GeneralOracle. This oracle attempts to discover if the uncertain parameters actually belong to seperate, disjunct uncertainty sets. This allows it to generate smaller reformulations, and a seperator cut generator for each set if using cutting-planes.
BertSimOracle, implements the uncertainty set described in the 2004 paper The Price of Robustness by Bertsimas and Sim. Will generate cutting-planes efficiently using sorting instead of solving an LP. The GeneralOracle should be used if a reformulation is desired.

Today we'll experiment with this feature by using some of the uncertainty sets defined in

D. Bertsimas, V. Gupta and N. Kallus, "Data-Driven Robust Optimization"

that have been turned into JuMPeR oracles by Vishal Gupta, and make available as DDUS.jl



In [ ]:

    
# First, install DDUS.jl
#Pkg.clone("https://github.com/vgupta1/DDUS.jl.git")
import DDUS



In [ ]:

    
function solve_dd_port(returns, α, β)
    n = size(returns,2)

    ddport = RobustModel(solver=GurobiSolver(OutputFlag=0))
    @defVar(ddport, 0 <= x[1:n] <= 1)
    @addConstraint(ddport, sum{x[i], i=1:n} == 1)
    
    # Create our Uncertains
    @defUnc(ddport, r[i=1:n])
    # The oracle defines an uncertainty set implicitly
    # from data - we don't provide it
    # There are many uncertainty sets defined in the paper
    # We'll test out the UCS set
    oracle = DDUS.UCSOracle(returns, α, β)
    # We need to tell JuMPeR to use the UCSOracle for the
    # uncertaint constraints
    setDefaultOracle!(ddport, oracle)
    
    # Objective - same as before
    @defVar(ddport, obj <= 1000)
    @setObjective(ddport, Max, obj)
    @addConstraint(ddport, obj <= sum{r[i]*x[i], i=1:n})

    # Solve it!
    solve(ddport, prefer_cuts=true)
    #solve(ddport, prefer_cuts=true, debug_printfinal=true)
    #solve(ellport, prefer_cuts=true)
    
    # Return the portfolio
    getValue(x)[:]
end

# Try it out
srand(1988)
train_returns = generate_data(10, 1000)
ddsol = solve_dd_port(train_returns, 0.1, 0.2)
println("Data-driven (UCS) Portfolio")
println(round(ddsol,2))
println("Data-driven (UCS) Portfolio on training data")
eval_portfolio(train_returns, ddsol)