Backtest Overfitting

Algorithm 2.3 (CSCV - Combinatorially symmetric cross-validation)

Step 1 - form matrix M

Matrix $M$ has $N$ columns, one for each strategy being evaluated for potential overfitting, and $T$ rows, $t=1,2,...,T$. Each column therefore holds the performance of one strategy at successive times. Keep in mind that the dates ($t=1,2,...,T$) have to be synchronised: if a performance measurement for strategy $1$ is made on date $x$, then a measurement has to be made on the same date for all other strategies $2,3,4,...,N$.

The performance measurements at $t=1,2,...,T$ are assumed to be independent. Thus, you cannot just run the $N$ strategies and sample the performance at different time points, because those samples would be autocorrelated. You can get around this by defining "windows": for example, take the starting and ending equity from $t=1$ to $t=2$ and compute a Sharpe ratio or an annual return from them. This will work in most cases. In some special cases, however, where the strategy would have performed completely differently (due to some kind of smoothing, for example), windowing is not a feasible solution. One would then be required to run $T$ independent backtests for each of the $N$ strategies in an attempt to adhere to the independence assumption.
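The windowing idea can be sketched as follows, assuming each backtest produces an equity curve sampled on the same $T+1$ dates for all strategies. The function name `window_returns`, the input layout and the toy equity values are illustrative assumptions, not part of the original algorithm; the sketch only shows the windowing and does not by itself guarantee independence.

```python
def window_returns(equity):
    """Turn equity curves into a T x N performance matrix M.

    equity: list of T+1 rows, each a list of N equity values sampled on the
    same dates for all strategies (hypothetical input layout).
    Row t of the result holds each strategy's simple return over window t.
    """
    M = []
    for start, end in zip(equity[:-1], equity[1:]):
        M.append([e / s - 1.0 for s, e in zip(start, end)])
    return M


# Toy example: N = 2 strategies observed on 4 common dates -> T = 3 windows.
equity = [
    [100.0, 100.0],
    [110.0, 95.0],
    [121.0, 95.0],
    [108.9, 104.5],
]
M = window_returns(equity)
for row in M:
    print([round(x, 4) for x in row])
```

Any other per-window statistic (a Sharpe ratio, say) could replace the simple return, as long as it is computed from that window's data only.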

Step 2 - Partition $M$ into $S$ disjoint submatrices

Note that $S$ should be even, and that $T$ must be divisible by $S$. You end up with $S$ matrices $M_s \in \mathbb{R}^{T/S \times N}$, $s=1,2,...,S$.

Step 3 - Form all combinations of $M_s$

Form the set $C_S$ of all combinations of the $M_s$ matrices, taken in groups of $S/2$. So suppose that $M$ is something like the following: $$M= \begin{bmatrix} [ ---- M_1 ---- ] \\ [ ---- M_2 ---- ] \\ [ ---- M_3 ---- ] \\ [ ---- M_4 ---- ] \end{bmatrix}$$

Where $M \in \mathbb{R}^{T \times N}$ and each $M_s \in \mathbb{R}^{T/S \times N}$. Note that:

$$ {S \choose S/2} = \frac{(S-1)! S}{(S-S/2)!(S/2-1)! S/2} = {S-1 \choose S/2-1} \frac{S}{S/2} = ... = \prod_{i=0}^{S/2-1} \frac{S-i}{S/2-i} $$

In the example matrix $S=4$, so according to the equation above the number of combinations is ${4 \choose 2 }= 6$, and these are:

$$C_4 = \{\{M_1,M_2\},\{M_1,M_3\},\{M_1,M_4\},\{M_2,M_3\},\{M_2,M_4\},\{M_3,M_4\}\}$$

Thus: $$|\{C_S\}| = {S \choose S/2}$$
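As a quick check of the counting formula, the sketch below enumerates the combinations for $S=4$ and compares the product formula with `math.comb`. The helper name `n_combinations` and the string labels standing in for the blocks are assumptions made for illustration.

```python
import itertools
import math


def n_combinations(S):
    """Number of ways to pick S/2 of the S blocks, via the product formula."""
    half = S // 2
    result = 1.0
    for i in range(half):
        result *= (S - i) / (half - i)
    return round(result)


S = 4
blocks = [f"M_{s}" for s in range(1, S + 1)]  # labels standing in for the M_s
C_S = list(itertools.combinations(blocks, S // 2))

print(C_S)
print(len(C_S), n_combinations(S), math.comb(S, S // 2))
```

The count grows quickly with $S$, which is worth keeping in mind when choosing the number of blocks.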

Step 4

For each combination $c \in C_S$ (e.g. $\{M_1,M_2\},\{M_1,M_3\},...$):

a) Form $J$ (training set).

If $c=\{M_1,M_2\}$, then

$$J= \begin{bmatrix} [ ---- M_1 ---- ] \\ [ ---- M_2 ---- ] \end{bmatrix}$$

$J \in \mathbb{R}^{T/2 \times N}$

b) Form $\bar{J}$ (testing set).

If $c=\{M_1,M_2\}$, then

$$\bar{J}= \begin{bmatrix} [ ---- M_3 ---- ] \\ [ ---- M_4 ---- ] \end{bmatrix}$$

$\bar{J} \in \mathbb{R}^{T/2 \times N}$

Note that the order is preserved! For some performance measures, e.g. Sharpe, order is not important. But for drawdown, it certainly is.
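One way to keep the order intact is to form the combinations over block indices rather than over the blocks themselves, and to take the test set as the ascending complement. The sketch below assumes this index-based approach; the variable names and the tiny stand-in blocks are illustrative only.

```python
import itertools

S = 4
# Stand-ins for the S row blocks of M; each block is a list of rows.
sub_matrices = [[[s, s], [s + 0.5, s + 0.5]] for s in range(1, S + 1)]

splits = []
for train_idx in itertools.combinations(range(S), S // 2):
    # Complement in ascending order, so chronological order is preserved.
    test_idx = tuple(i for i in range(S) if i not in train_idx)
    # Concatenate the chosen blocks row-wise.
    J = [row for i in train_idx for row in sub_matrices[i]]
    J_bar = [row for i in test_idx for row in sub_matrices[i]]
    splits.append((train_idx, test_idx, J, J_bar))

print([(tr, te) for tr, te, _, _ in splits])
```

Every set of blocks used for training in one split is used for testing in another, which is where the "symmetric" in the algorithm's name comes from.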

c) Form vector $R^c$ (IS performance)

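Entry $n$ of $R^c$ is the performance of strategy $n$ over the training set $J$ (presumably obtained by aggregating the performance measures at the different times, e.g. by summing them, as the example below does). Derive $r^c$, the ranking of the entries of $R^c$.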

d) Form vector $\bar{R}^c$ (OOS performance).

Derive $\bar{r}^c$.

e) Determine the element $n^*$ such that $r^c \in \Omega^*_{n^*}$.

In other words, $n^*$ is the index of the best-performing strategy IS. For example, if $r^c=(1,4,2,3)$, the best-performing strategy is at index $2$, so $n^*=2$. This is because $\Omega_n^*$ is the set of rankings in which strategy $n$ is ranked best; for example, $\Omega_3^*$ when $N=4$ is $\{f \in \Omega | f_3=4 \} = \{ (1,2,4,3),(2,1,4,3), (1,3,4,2), (3,1,4,2), ... \}$, i.e. the third strategy is always the best.

f) Define the relative rank of $\bar{r}_{n^*}^c$ (i.e. the relative rank OOS of the strategy that performed best IS).

$\bar{\omega}_c := \bar{r}^c_{n^*}/(N+1) \in (0,1)$

If the strategy optimization procedure is not overfitting, we should observe that $\bar{r}_{n^*}^c$ is systematically high, i.e. the strategy selected IS also outperforms OOS.

g) Define/compute the logit $\lambda_c = \ln{\frac{\bar{\omega}_c}{1-\bar{\omega}_c}}$.

High logit values imply consistency between IS and OOS performance, which indicates a low level of overfitting.

To summarize:

1) Form training set $J$

2) Form testing set $\bar{J}$

3) Form the IS performance vector $R^c$, derive $r^c$

4) Form the OOS performance vector $\bar{R}^c$, derive $\bar{r}^c$

5) Determine element $n^*$

6) Compute relative rank: $\bar{\omega}_c$

7) Compute logit: $\lambda_c$
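The sub-steps summarised above can be sketched for a single combination $c$ as follows. This is a minimal sketch in plain Python: the helper names `ranks` and `cscv_logit` and the toy performance values are assumptions made for illustration, and ties are broken by position, which the text does not specify. Indices are 0-based here, whereas the text counts from 1.

```python
import math


def ranks(values):
    """Rank values from 1 (worst) to N (best); ties broken by position."""
    order = sorted(range(len(values)), key=lambda n: values[n])
    r = [0] * len(values)
    for rank, n in enumerate(order, start=1):
        r[n] = rank
    return r


def cscv_logit(R, R_bar):
    """Return (n_star, omega_bar, logit) for one combination c."""
    N = len(R)
    r = ranks(R)          # IS ranking r^c
    r_bar = ranks(R_bar)  # OOS ranking r_bar^c
    n_star = r.index(N)   # best strategy IS (0-based index)
    omega_bar = r_bar[n_star] / (N + 1)  # relative rank OOS, in (0, 1)
    return n_star, omega_bar, math.log(omega_bar / (1.0 - omega_bar))


R = [0.2, 1.5, 0.7, 0.9]      # IS performance of N = 4 strategies (toy values)
R_bar = [0.4, 0.1, 0.8, 0.6]  # OOS performance of the same strategies
n_star, omega_bar, lam = cscv_logit(R, R_bar)
print(ranks(R), n_star, omega_bar, round(lam, 4))
```

With these toy values the IS ranking is $(1,4,2,3)$, as in the example of step e), and the strategy selected IS ranks worst OOS, so the logit is negative.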

Step 5 - Logit Frequency

Compute the distribution of ranks OOS by collecting all the $\lambda_c$, for $c \in C_S$:

$$f(\lambda) = \sum_{c \in C_S} \frac{\chi_{\{\lambda\}} \left( \lambda_c\right)}{|\{C_S\}|}$$

Also note

$$\int_{-\infty}^{\infty} f(\lambda) d\lambda = 1$$

Overfit Statistics

1) Probability of Backtest Overfitting (PBO)

The probability that the model configuration selected as optimal IS will underperform the median of the N model configurations OOS.

$$\text{PBO} = \sum_{n=1}^N P\left[ \bar{r_n} < N/2 | r \in \Omega^*_{n} \right] P\left[ r \in \Omega^*_n \right] = \int_{-\infty}^0 f(\lambda) d\lambda$$

Where,

$$f(\lambda) = \sum_{c \in C_S} \frac{\chi_{\{ \lambda\}} \left( \lambda_c \right)}{|\{ C_S \}|}$$

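i.e. $f$ is the relative frequency (the discrete analogue of a PDF) of the logits. The logits are discrete, so the integral above amounts to a sum over the observed values ($\chi_{\{\lambda\}} \left( \lambda_c \right)$ is an indicator function, equal to $1$ if $\lambda_c = \lambda$ and $0$ otherwise).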
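Given the logits of all combinations, the frequency $f$ and the PBO reduce to simple counting. The sketch below assumes the logits are already available as a list; the function names and the logit values are made up for illustration, and counting only strictly negative logits (rather than including zero) is a convention chosen here, not something the text settles.

```python
from collections import Counter


def logit_frequency(logits):
    """Relative frequency f(lambda) of each distinct logit value."""
    counts = Counter(logits)
    return {lam: k / len(logits) for lam, k in counts.items()}


def pbo(logits):
    """Share of combinations whose logit is strictly negative."""
    return sum(1 for lam in logits if lam < 0) / len(logits)


# Hypothetical logits for the 6 combinations of an S = 4 run.
logits = [-1.386, -0.405, -0.405, 0.405, 1.386, -1.386]
f = logit_frequency(logits)
print(f)
print(sum(f.values()), pbo(logits))
```

A PBO well above one half would suggest that picking the IS-optimal strategy tends to do worse than the OOS median.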

2) Performance degradation

This determines to what extent greater performance IS leads to lower performance OOS, an occurrence associated with the memory effects discussed in Bailey et al. [1]

Perform a regression:

$$\bar{R}^c_{n^*} = \alpha + \beta R_{n^*}^c + \epsilon^c$$
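The regression can be fitted by ordinary least squares over the pairs $(R^c_{n^*}, \bar{R}^c_{n^*})$, one pair per combination. The sketch below writes the fit out by hand so that it needs no libraries; the function name `ols_fit` and the sample values are assumptions made for illustration.

```python
def ols_fit(x, y):
    """Least-squares fit of y = alpha + beta * x; returns (alpha, beta)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta


# Hypothetical IS and OOS performance of the IS-optimal strategy,
# one entry per combination c.
R_star = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
R_bar_star = [0.9, 0.6, 0.5, 0.1, 0.0, -0.3]
alpha, beta = ols_fit(R_star, R_bar_star)
print(round(alpha, 4), round(beta, 4))
```

A negative slope $\beta$, as in this made-up sample, would be the pattern described above: the better the selected strategy looks IS, the worse it does OOS.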

3) Probability of loss

The probability that the model selected as optimal IS will deliver a loss OOS.

Compute:

$$P \left[ \bar{R}^c_{n^*} < 0 \right]$$

4) Stochastic dominance

This analysis determines whether the procedure used to select a strategy IS is preferable to randomly choosing one model configuration among the $N$ alternatives.

First-order stochastic dominance holds if:

$$P \left[ \bar{R}_{n^*} \geq x \right] \geq P \left[\text{Mean}(\bar{R}) \geq x \right] \ \ \forall x$$

and $$P \left[ \bar{R}_{n^*} \geq x \right] > P \left[\text{Mean}(\bar{R}) \geq x \right] \ \ \text{for some} \ \ x$$

A less demanding criterion is second-order stochastic dominance. This requires that:

$$\text{SD}2[x] = \int_{-\infty}^x \left( P\left[ \text{Mean}(\bar{R}) \leq u \right] - P\left[ \bar{R}_{n^*} \leq u \right] \right) du \geq 0 \ \ \forall x$$

and that

$$\text{SD}2[x] > 0 \ \ \text{at some} \ \ x$$
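Both dominance criteria can be checked approximately from two samples by comparing empirical CDFs on a grid. The sketch below assumes one sample of OOS performances of the selected strategy and one benchmark sample standing in for $\text{Mean}(\bar{R})$; the function names, the grid size, the tolerance and the sample values are illustrative choices, and the grid check is only an approximation of the "for all $x$" conditions.

```python
def ecdf(sample, x):
    """Empirical P[X <= x] for a list of observations."""
    return sum(1 for v in sample if v <= x) / len(sample)


def dominance(selected, benchmark, n_grid=200):
    """Return (first_order, second_order) dominance of `selected` over
    `benchmark`, checked on an evenly spaced grid spanning both samples."""
    lo = min(min(selected), min(benchmark))
    hi = max(max(selected), max(benchmark))
    step = (hi - lo) / (n_grid - 1)
    grid = [lo + k * step for k in range(n_grid)]
    # diff(x) = P[benchmark <= x] - P[selected <= x]; >= 0 favours `selected`.
    diff = [ecdf(benchmark, x) - ecdf(selected, x) for x in grid]
    first = all(d >= 0 for d in diff) and any(d > 0 for d in diff)
    # SD2 as a running (left Riemann) integral of diff.
    sd2, total = [], 0.0
    for d in diff:
        total += d * step
        sd2.append(total)
    second = all(s >= -1e-12 for s in sd2) and any(s > 1e-12 for s in sd2)
    return first, second


selected = [0.2, 0.4, 0.6, 0.8]   # hypothetical OOS results of the IS-optimal pick
benchmark = [0.1, 0.3, 0.5, 0.7]  # hypothetical benchmark results
first, second = dominance(selected, benchmark)
print(first, second)
```

First-order dominance implies second-order dominance, so the second check matters mainly when the first one fails.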

CSCV example


In [148]:
%matplotlib inline
from numpy import inf
import matplotlib.pyplot as plt
import seaborn as sns # not required
sns.set_style('darkgrid') # not required
import numpy as np
import itertools
import pandas as pd
import scipy.stats

In [84]:
T = 16
S = 4
N = 30

M = np.random.rand(T, N)

M


Out[84]:
array([[  5.96100401e-01,   6.12068171e-01,   4.90046680e-01,
          3.68317677e-01,   2.84328370e-01,   4.23512291e-01,
          4.81132791e-01,   4.35510843e-01,   9.76998822e-01,
          7.80107014e-02,   7.71919311e-01,   8.92686158e-01,
          2.15235501e-01,   8.16012179e-02,   5.13492075e-01,
          7.55796776e-01,   5.40407247e-01,   5.77258227e-01,
          6.68513015e-01,   1.99101464e-01,   8.69991142e-01,
          2.91919263e-01,   6.44774661e-02,   3.54813826e-01,
          5.09583023e-01,   7.92406099e-01,   4.75189975e-01,
          9.25828167e-01,   9.06994770e-01,   3.54382721e-01],
       [  1.09992240e-01,   9.27317919e-01,   8.22580955e-01,
          4.28230859e-01,   2.20196814e-01,   9.23646517e-03,
          3.13450092e-01,   1.37183313e-03,   4.48260857e-01,
          5.82162279e-01,   8.01500292e-01,   2.32425967e-01,
          8.26222101e-01,   6.08631157e-01,   8.89367289e-02,
          4.15516382e-01,   2.02890299e-01,   3.97277491e-01,
          4.93145310e-02,   6.75540680e-01,   5.02314276e-01,
          1.42287711e-01,   7.59394693e-01,   7.32039940e-01,
          4.02131195e-01,   9.56356277e-01,   7.93856962e-01,
          8.46335657e-01,   6.78098704e-01,   2.68015528e-01],
       [  4.95271965e-01,   1.06376987e-01,   5.50226201e-02,
          9.19033512e-01,   9.62201504e-01,   8.99894505e-02,
          4.80518664e-02,   6.24369217e-01,   5.24762347e-01,
          4.60128405e-01,   2.59601224e-01,   3.63550462e-01,
          1.07164697e-01,   3.44224591e-01,   8.21036056e-01,
          8.06862841e-01,   6.44115759e-01,   5.90575848e-01,
          9.95989737e-01,   3.57079131e-01,   6.59271101e-01,
          9.48490287e-01,   1.09228588e-01,   8.67619253e-01,
          6.21357751e-01,   2.66938111e-01,   3.53080167e-01,
          5.55684330e-01,   5.80266283e-01,   6.23143200e-01],
       [  8.97472093e-03,   9.56771288e-01,   8.69995870e-01,
          7.69962718e-01,   3.50903197e-01,   1.67649412e-02,
          4.78149961e-01,   2.63120347e-01,   8.42403400e-01,
          3.81854467e-01,   6.61305843e-01,   8.15434741e-01,
          1.96525244e-01,   9.04862592e-01,   2.69671100e-01,
          5.27628922e-01,   4.54980359e-01,   9.34072694e-01,
          4.76913025e-01,   5.37778894e-01,   2.47892532e-01,
          2.15565715e-01,   9.86982927e-01,   3.81054037e-01,
          9.35229995e-01,   4.75784429e-01,   9.53922301e-01,
          2.84184448e-01,   5.86045291e-01,   4.39635002e-01],
       [  2.53625582e-01,   4.35884352e-01,   9.87910224e-01,
          6.50012793e-01,   6.26535842e-01,   2.28777162e-01,
          3.59233375e-02,   8.43683552e-01,   9.81719931e-01,
          7.51044348e-01,   9.32197136e-01,   5.66849880e-01,
          2.87819439e-01,   5.94401612e-01,   3.66935311e-02,
          8.74196810e-01,   3.51186771e-01,   1.64235230e-03,
          6.56678385e-01,   5.98460509e-01,   9.21944655e-01,
          8.92121119e-01,   6.75462901e-01,   3.38225621e-01,
          1.37682662e-01,   6.94802156e-01,   5.34104268e-01,
          4.64971928e-01,   6.51243218e-01,   9.23263353e-01],
       [  1.91430618e-01,   8.71261191e-01,   2.37693793e-01,
          7.06349337e-01,   8.42065287e-01,   3.15328385e-01,
          3.59356004e-02,   7.35286960e-01,   4.17615328e-01,
          7.00144320e-01,   9.50875235e-01,   2.08998796e-01,
          3.05158480e-01,   4.36991353e-01,   4.67440706e-01,
          6.47018941e-01,   1.31180717e-02,   1.29917831e-01,
          7.82527160e-01,   9.29849079e-01,   3.62330836e-01,
          7.21399935e-01,   8.14214907e-02,   7.05076048e-01,
          3.97146048e-01,   6.26667736e-01,   8.45701913e-01,
          7.30270340e-01,   6.22547366e-01,   2.99938839e-01],
       [  6.33544652e-01,   8.95177826e-01,   3.64129581e-01,
          4.03138861e-01,   7.36619432e-01,   5.20066018e-01,
          2.21607751e-01,   9.67875526e-01,   4.81141308e-01,
          4.58610155e-01,   5.61104272e-01,   4.36746991e-02,
          3.53928378e-01,   2.84937851e-01,   7.50289710e-01,
          2.79817548e-01,   4.76932456e-01,   2.04717291e-01,
          4.09212147e-01,   9.75140248e-01,   1.84323833e-01,
          5.05854252e-01,   9.17599363e-01,   5.58598662e-01,
          9.07438122e-01,   1.74550563e-01,   8.50694604e-01,
          4.13799609e-01,   4.46013065e-01,   5.08901275e-01],
       [  6.77625986e-01,   9.72271046e-01,   8.97138082e-01,
          2.63489911e-01,   9.27259445e-01,   3.92986458e-01,
          2.19704847e-01,   2.37276883e-01,   6.21544571e-01,
          9.37056764e-02,   5.45720183e-01,   9.29044971e-01,
          1.63950486e-01,   8.07889828e-01,   6.26632614e-01,
          2.98241718e-01,   5.81616304e-01,   5.49298115e-01,
          5.48229321e-01,   6.31042885e-01,   2.18339476e-01,
          7.12313712e-01,   5.57350625e-01,   7.09759072e-01,
          9.99247713e-01,   1.95797967e-01,   3.72177038e-01,
          2.09806157e-01,   9.42090536e-01,   6.50194502e-01],
       [  9.79316804e-01,   7.96846675e-01,   2.41915538e-01,
          2.20170758e-01,   8.88233429e-01,   5.56024989e-01,
          8.78466303e-01,   4.78631207e-01,   8.88738466e-01,
          7.37555234e-01,   6.93145828e-01,   9.16675254e-01,
          9.41537819e-01,   7.52225121e-01,   5.00516048e-01,
          8.25581556e-01,   3.01706793e-01,   2.93343096e-01,
          7.36316177e-01,   2.43406259e-01,   5.71130368e-01,
          4.46251642e-01,   1.89778155e-01,   6.28376056e-01,
          9.49397543e-01,   2.14092423e-01,   4.42207731e-01,
          4.37551963e-03,   5.88305948e-01,   9.98413168e-01],
       [  7.58722325e-01,   2.07674052e-01,   9.02002077e-01,
          7.11460904e-01,   6.29684495e-01,   4.90363415e-01,
          7.96955584e-01,   6.15731460e-01,   3.05595170e-01,
          8.92969642e-01,   6.74632596e-01,   1.84963160e-01,
          7.63934585e-01,   1.81910979e-01,   8.18465119e-01,
          1.67035847e-01,   2.26083501e-01,   5.97249624e-01,
          2.65957176e-02,   5.30958339e-01,   3.03613237e-01,
          8.33429733e-01,   5.81299777e-01,   3.10836896e-01,
          3.63913113e-01,   5.93086206e-01,   6.56856726e-01,
          5.73414462e-01,   5.52759734e-01,   6.22336816e-01],
       [  4.99600776e-02,   5.90333709e-02,   1.19992730e-01,
          1.55784475e-01,   2.72280826e-02,   9.77397837e-02,
          9.36032696e-01,   3.59180172e-01,   3.82854217e-01,
          6.62781990e-01,   9.91036078e-01,   8.00632564e-01,
          9.75939095e-01,   6.41573642e-01,   8.54589662e-01,
          4.80543150e-01,   8.35187958e-01,   5.07554090e-01,
          7.85456079e-01,   4.68152879e-01,   9.75441988e-01,
          7.60277534e-02,   1.95394253e-01,   2.86217971e-01,
          8.46593851e-01,   6.58072824e-01,   5.67687684e-01,
          1.81742868e-01,   5.98308739e-01,   3.23253660e-01],
       [  6.30189901e-01,   7.02505142e-01,   3.86416372e-01,
          6.54558055e-01,   9.01667532e-01,   3.08084785e-01,
          9.77801222e-01,   2.78190702e-01,   6.11049598e-01,
          6.75708191e-01,   2.56383358e-01,   3.79841730e-01,
          6.37367177e-01,   5.42295835e-02,   5.63911840e-01,
          1.00471425e-01,   4.17030595e-02,   9.53664018e-01,
          4.39640067e-01,   8.67676090e-01,   1.63995344e-01,
          3.38445002e-01,   9.74236172e-01,   3.49071845e-01,
          2.47776143e-02,   7.83852419e-01,   2.66529330e-01,
          4.40957954e-01,   9.90951479e-01,   1.61078003e-01],
       [  5.90642630e-01,   4.58293028e-01,   6.57607843e-01,
          5.27202202e-01,   5.51025676e-01,   4.71786420e-02,
          9.75460186e-01,   5.07370304e-01,   3.81496453e-01,
          8.19575449e-01,   4.92587224e-01,   4.59547937e-01,
          7.24198659e-02,   6.12107928e-01,   3.03782827e-01,
          1.75741093e-02,   2.25520934e-01,   9.64033526e-01,
          7.33546652e-01,   2.60414594e-01,   6.00809362e-01,
          1.28829474e-01,   2.65799156e-01,   3.02015032e-01,
          5.93816538e-01,   5.63080027e-01,   4.42377138e-01,
          5.69146905e-01,   4.52088043e-01,   6.85915396e-01],
       [  3.38190019e-01,   9.88633136e-02,   4.85365344e-01,
          2.90835328e-01,   8.60389651e-01,   1.69302307e-01,
          8.48125693e-01,   6.64687508e-01,   7.92089141e-01,
          9.32571741e-01,   4.52069646e-01,   7.75234782e-01,
          4.85221093e-01,   8.08991929e-01,   4.37171974e-01,
          8.42891968e-01,   6.37319711e-01,   2.59981718e-01,
          5.20214302e-01,   7.24707637e-01,   2.69366168e-01,
          2.34576127e-01,   2.57103994e-02,   3.54890998e-01,
          8.62024137e-01,   6.94579878e-01,   3.23376811e-01,
          5.91191251e-01,   7.05645936e-01,   9.83679887e-01],
       [  1.06517713e-04,   4.05472646e-01,   7.41305150e-01,
          8.38528473e-01,   5.36945329e-01,   8.79237174e-01,
          4.76667857e-01,   8.22726253e-01,   9.18995666e-01,
          3.99787498e-01,   5.21051680e-01,   1.31282148e-01,
          8.95466711e-01,   3.93664072e-01,   7.81152506e-01,
          4.93095101e-02,   9.11019720e-02,   7.97364194e-01,
          2.47407707e-01,   2.01443327e-01,   2.00571156e-01,
          6.99093131e-01,   6.67747514e-01,   4.35593820e-01,
          6.93677441e-02,   4.73942734e-01,   6.21677977e-01,
          8.46328462e-01,   1.01056635e-01,   9.09405859e-01],
       [  7.72420204e-01,   9.98810350e-01,   9.59423652e-01,
          5.21405848e-01,   6.84360663e-01,   8.50076261e-01,
          5.64303851e-01,   2.88584456e-01,   8.62331510e-01,
          5.76434488e-01,   4.41198087e-01,   1.03716030e-02,
          4.49733008e-01,   3.55235338e-01,   3.90953440e-01,
          4.46829738e-01,   3.23804583e-01,   4.65262792e-01,
          8.57902020e-01,   8.93339444e-01,   9.61079083e-01,
          8.75493948e-01,   1.96422810e-01,   1.08842946e-01,
          4.21685400e-01,   7.88706281e-01,   2.44868838e-02,
          1.08315787e-01,   3.06347258e-01,   5.72416081e-01]])

In [85]:
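# Step 2 - partition M into S disjoint blocks of T/S consecutive rows each
rows_per_block = T // S
subMatrices = []

for i in range(S):
    subMatrices.append(M[rows_per_block*i:rows_per_block*(i + 1)])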
    
subMatrices


Out[85]:
[array([[ 0.5961004 ,  0.61206817,  0.49004668,  0.36831768,  0.28432837,
          0.42351229,  0.48113279,  0.43551084,  0.97699882,  0.0780107 ,
          0.77191931,  0.89268616,  0.2152355 ,  0.08160122,  0.51349207,
          0.75579678,  0.54040725,  0.57725823,  0.66851301,  0.19910146,
          0.86999114,  0.29191926,  0.06447747,  0.35481383,  0.50958302,
          0.7924061 ,  0.47518998,  0.92582817,  0.90699477,  0.35438272],
        [ 0.10999224,  0.92731792,  0.82258096,  0.42823086,  0.22019681,
          0.00923647,  0.31345009,  0.00137183,  0.44826086,  0.58216228,
          0.80150029,  0.23242597,  0.8262221 ,  0.60863116,  0.08893673,
          0.41551638,  0.2028903 ,  0.39727749,  0.04931453,  0.67554068,
          0.50231428,  0.14228771,  0.75939469,  0.73203994,  0.4021312 ,
          0.95635628,  0.79385696,  0.84633566,  0.6780987 ,  0.26801553],
        [ 0.49527197,  0.10637699,  0.05502262,  0.91903351,  0.9622015 ,
          0.08998945,  0.04805187,  0.62436922,  0.52476235,  0.46012841,
          0.25960122,  0.36355046,  0.1071647 ,  0.34422459,  0.82103606,
          0.80686284,  0.64411576,  0.59057585,  0.99598974,  0.35707913,
          0.6592711 ,  0.94849029,  0.10922859,  0.86761925,  0.62135775,
          0.26693811,  0.35308017,  0.55568433,  0.58026628,  0.6231432 ],
        [ 0.00897472,  0.95677129,  0.86999587,  0.76996272,  0.3509032 ,
          0.01676494,  0.47814996,  0.26312035,  0.8424034 ,  0.38185447,
          0.66130584,  0.81543474,  0.19652524,  0.90486259,  0.2696711 ,
          0.52762892,  0.45498036,  0.93407269,  0.47691303,  0.53777889,
          0.24789253,  0.21556571,  0.98698293,  0.38105404,  0.93523   ,
          0.47578443,  0.9539223 ,  0.28418445,  0.58604529,  0.439635  ]]),
 array([[ 0.25362558,  0.43588435,  0.98791022,  0.65001279,  0.62653584,
          0.22877716,  0.03592334,  0.84368355,  0.98171993,  0.75104435,
          0.93219714,  0.56684988,  0.28781944,  0.59440161,  0.03669353,
          0.87419681,  0.35118677,  0.00164235,  0.65667839,  0.59846051,
          0.92194465,  0.89212112,  0.6754629 ,  0.33822562,  0.13768266,
          0.69480216,  0.53410427,  0.46497193,  0.65124322,  0.92326335],
        [ 0.19143062,  0.87126119,  0.23769379,  0.70634934,  0.84206529,
          0.31532838,  0.0359356 ,  0.73528696,  0.41761533,  0.70014432,
          0.95087524,  0.2089988 ,  0.30515848,  0.43699135,  0.46744071,
          0.64701894,  0.01311807,  0.12991783,  0.78252716,  0.92984908,
          0.36233084,  0.72139994,  0.08142149,  0.70507605,  0.39714605,
          0.62666774,  0.84570191,  0.73027034,  0.62254737,  0.29993884],
        [ 0.63354465,  0.89517783,  0.36412958,  0.40313886,  0.73661943,
          0.52006602,  0.22160775,  0.96787553,  0.48114131,  0.45861016,
          0.56110427,  0.0436747 ,  0.35392838,  0.28493785,  0.75028971,
          0.27981755,  0.47693246,  0.20471729,  0.40921215,  0.97514025,
          0.18432383,  0.50585425,  0.91759936,  0.55859866,  0.90743812,
          0.17455056,  0.8506946 ,  0.41379961,  0.44601306,  0.50890128],
        [ 0.67762599,  0.97227105,  0.89713808,  0.26348991,  0.92725944,
          0.39298646,  0.21970485,  0.23727688,  0.62154457,  0.09370568,
          0.54572018,  0.92904497,  0.16395049,  0.80788983,  0.62663261,
          0.29824172,  0.5816163 ,  0.54929812,  0.54822932,  0.63104288,
          0.21833948,  0.71231371,  0.55735062,  0.70975907,  0.99924771,
          0.19579797,  0.37217704,  0.20980616,  0.94209054,  0.6501945 ]]),
 array([[ 0.9793168 ,  0.79684668,  0.24191554,  0.22017076,  0.88823343,
          0.55602499,  0.8784663 ,  0.47863121,  0.88873847,  0.73755523,
          0.69314583,  0.91667525,  0.94153782,  0.75222512,  0.50051605,
          0.82558156,  0.30170679,  0.2933431 ,  0.73631618,  0.24340626,
          0.57113037,  0.44625164,  0.18977816,  0.62837606,  0.94939754,
          0.21409242,  0.44220773,  0.00437552,  0.58830595,  0.99841317],
        [ 0.75872233,  0.20767405,  0.90200208,  0.7114609 ,  0.6296845 ,
          0.49036341,  0.79695558,  0.61573146,  0.30559517,  0.89296964,
          0.6746326 ,  0.18496316,  0.76393459,  0.18191098,  0.81846512,
          0.16703585,  0.2260835 ,  0.59724962,  0.02659572,  0.53095834,
          0.30361324,  0.83342973,  0.58129978,  0.3108369 ,  0.36391311,
          0.59308621,  0.65685673,  0.57341446,  0.55275973,  0.62233682],
        [ 0.04996008,  0.05903337,  0.11999273,  0.15578447,  0.02722808,
          0.09773978,  0.9360327 ,  0.35918017,  0.38285422,  0.66278199,
          0.99103608,  0.80063256,  0.97593909,  0.64157364,  0.85458966,
          0.48054315,  0.83518796,  0.50755409,  0.78545608,  0.46815288,
          0.97544199,  0.07602775,  0.19539425,  0.28621797,  0.84659385,
          0.65807282,  0.56768768,  0.18174287,  0.59830874,  0.32325366],
        [ 0.6301899 ,  0.70250514,  0.38641637,  0.65455806,  0.90166753,
          0.30808479,  0.97780122,  0.2781907 ,  0.6110496 ,  0.67570819,
          0.25638336,  0.37984173,  0.63736718,  0.05422958,  0.56391184,
          0.10047142,  0.04170306,  0.95366402,  0.43964007,  0.86767609,
          0.16399534,  0.338445  ,  0.97423617,  0.34907185,  0.02477761,
          0.78385242,  0.26652933,  0.44095795,  0.99095148,  0.161078  ]]),
 array([[  5.90642630e-01,   4.58293028e-01,   6.57607843e-01,
           5.27202202e-01,   5.51025676e-01,   4.71786420e-02,
           9.75460186e-01,   5.07370304e-01,   3.81496453e-01,
           8.19575449e-01,   4.92587224e-01,   4.59547937e-01,
           7.24198659e-02,   6.12107928e-01,   3.03782827e-01,
           1.75741093e-02,   2.25520934e-01,   9.64033526e-01,
           7.33546652e-01,   2.60414594e-01,   6.00809362e-01,
           1.28829474e-01,   2.65799156e-01,   3.02015032e-01,
           5.93816538e-01,   5.63080027e-01,   4.42377138e-01,
           5.69146905e-01,   4.52088043e-01,   6.85915396e-01],
        [  3.38190019e-01,   9.88633136e-02,   4.85365344e-01,
           2.90835328e-01,   8.60389651e-01,   1.69302307e-01,
           8.48125693e-01,   6.64687508e-01,   7.92089141e-01,
           9.32571741e-01,   4.52069646e-01,   7.75234782e-01,
           4.85221093e-01,   8.08991929e-01,   4.37171974e-01,
           8.42891968e-01,   6.37319711e-01,   2.59981718e-01,
           5.20214302e-01,   7.24707637e-01,   2.69366168e-01,
           2.34576127e-01,   2.57103994e-02,   3.54890998e-01,
           8.62024137e-01,   6.94579878e-01,   3.23376811e-01,
           5.91191251e-01,   7.05645936e-01,   9.83679887e-01],
        [  1.06517713e-04,   4.05472646e-01,   7.41305150e-01,
           8.38528473e-01,   5.36945329e-01,   8.79237174e-01,
           4.76667857e-01,   8.22726253e-01,   9.18995666e-01,
           3.99787498e-01,   5.21051680e-01,   1.31282148e-01,
           8.95466711e-01,   3.93664072e-01,   7.81152506e-01,
           4.93095101e-02,   9.11019720e-02,   7.97364194e-01,
           2.47407707e-01,   2.01443327e-01,   2.00571156e-01,
           6.99093131e-01,   6.67747514e-01,   4.35593820e-01,
           6.93677441e-02,   4.73942734e-01,   6.21677977e-01,
           8.46328462e-01,   1.01056635e-01,   9.09405859e-01],
        [  7.72420204e-01,   9.98810350e-01,   9.59423652e-01,
           5.21405848e-01,   6.84360663e-01,   8.50076261e-01,
           5.64303851e-01,   2.88584456e-01,   8.62331510e-01,
           5.76434488e-01,   4.41198087e-01,   1.03716030e-02,
           4.49733008e-01,   3.55235338e-01,   3.90953440e-01,
           4.46829738e-01,   3.23804583e-01,   4.65262792e-01,
           8.57902020e-01,   8.93339444e-01,   9.61079083e-01,
           8.75493948e-01,   1.96422810e-01,   1.08842946e-01,
           4.21685400e-01,   7.88706281e-01,   2.44868838e-02,
           1.08315787e-01,   3.06347258e-01,   5.72416081e-01]])]

In [86]:
combinations = itertools.combinations(subMatrices, int(S/2))

In [87]:
# Step 4 - 1) form J
J = np.array(next(combinations))

J


Out[87]:
array([[[ 0.5961004 ,  0.61206817,  0.49004668,  0.36831768,  0.28432837,
          0.42351229,  0.48113279,  0.43551084,  0.97699882,  0.0780107 ,
          0.77191931,  0.89268616,  0.2152355 ,  0.08160122,  0.51349207,
          0.75579678,  0.54040725,  0.57725823,  0.66851301,  0.19910146,
          0.86999114,  0.29191926,  0.06447747,  0.35481383,  0.50958302,
          0.7924061 ,  0.47518998,  0.92582817,  0.90699477,  0.35438272],
        [ 0.10999224,  0.92731792,  0.82258096,  0.42823086,  0.22019681,
          0.00923647,  0.31345009,  0.00137183,  0.44826086,  0.58216228,
          0.80150029,  0.23242597,  0.8262221 ,  0.60863116,  0.08893673,
          0.41551638,  0.2028903 ,  0.39727749,  0.04931453,  0.67554068,
          0.50231428,  0.14228771,  0.75939469,  0.73203994,  0.4021312 ,
          0.95635628,  0.79385696,  0.84633566,  0.6780987 ,  0.26801553],
        [ 0.49527197,  0.10637699,  0.05502262,  0.91903351,  0.9622015 ,
          0.08998945,  0.04805187,  0.62436922,  0.52476235,  0.46012841,
          0.25960122,  0.36355046,  0.1071647 ,  0.34422459,  0.82103606,
          0.80686284,  0.64411576,  0.59057585,  0.99598974,  0.35707913,
          0.6592711 ,  0.94849029,  0.10922859,  0.86761925,  0.62135775,
          0.26693811,  0.35308017,  0.55568433,  0.58026628,  0.6231432 ],
        [ 0.00897472,  0.95677129,  0.86999587,  0.76996272,  0.3509032 ,
          0.01676494,  0.47814996,  0.26312035,  0.8424034 ,  0.38185447,
          0.66130584,  0.81543474,  0.19652524,  0.90486259,  0.2696711 ,
          0.52762892,  0.45498036,  0.93407269,  0.47691303,  0.53777889,
          0.24789253,  0.21556571,  0.98698293,  0.38105404,  0.93523   ,
          0.47578443,  0.9539223 ,  0.28418445,  0.58604529,  0.439635  ]],

       [[ 0.25362558,  0.43588435,  0.98791022,  0.65001279,  0.62653584,
          0.22877716,  0.03592334,  0.84368355,  0.98171993,  0.75104435,
          0.93219714,  0.56684988,  0.28781944,  0.59440161,  0.03669353,
          0.87419681,  0.35118677,  0.00164235,  0.65667839,  0.59846051,
          0.92194465,  0.89212112,  0.6754629 ,  0.33822562,  0.13768266,
          0.69480216,  0.53410427,  0.46497193,  0.65124322,  0.92326335],
        [ 0.19143062,  0.87126119,  0.23769379,  0.70634934,  0.84206529,
          0.31532838,  0.0359356 ,  0.73528696,  0.41761533,  0.70014432,
          0.95087524,  0.2089988 ,  0.30515848,  0.43699135,  0.46744071,
          0.64701894,  0.01311807,  0.12991783,  0.78252716,  0.92984908,
          0.36233084,  0.72139994,  0.08142149,  0.70507605,  0.39714605,
          0.62666774,  0.84570191,  0.73027034,  0.62254737,  0.29993884],
        [ 0.63354465,  0.89517783,  0.36412958,  0.40313886,  0.73661943,
          0.52006602,  0.22160775,  0.96787553,  0.48114131,  0.45861016,
          0.56110427,  0.0436747 ,  0.35392838,  0.28493785,  0.75028971,
          0.27981755,  0.47693246,  0.20471729,  0.40921215,  0.97514025,
          0.18432383,  0.50585425,  0.91759936,  0.55859866,  0.90743812,
          0.17455056,  0.8506946 ,  0.41379961,  0.44601306,  0.50890128],
        [ 0.67762599,  0.97227105,  0.89713808,  0.26348991,  0.92725944,
          0.39298646,  0.21970485,  0.23727688,  0.62154457,  0.09370568,
          0.54572018,  0.92904497,  0.16395049,  0.80788983,  0.62663261,
          0.29824172,  0.5816163 ,  0.54929812,  0.54822932,  0.63104288,
          0.21833948,  0.71231371,  0.55735062,  0.70975907,  0.99924771,
          0.19579797,  0.37217704,  0.20980616,  0.94209054,  0.6501945 ]]])

In [88]:
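# Step 4 - 2) form J_bar
# `x not in J` on NumPy arrays only tests whether any single element matches,
# so compare whole blocks instead. The list comprehension preserves the order!
J_bar = [x for x in subMatrices if not any(np.array_equal(x, j) for j in J)]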

J_bar


Out[88]:
[array([[ 0.9793168 ,  0.79684668,  0.24191554,  0.22017076,  0.88823343,
          0.55602499,  0.8784663 ,  0.47863121,  0.88873847,  0.73755523,
          0.69314583,  0.91667525,  0.94153782,  0.75222512,  0.50051605,
          0.82558156,  0.30170679,  0.2933431 ,  0.73631618,  0.24340626,
          0.57113037,  0.44625164,  0.18977816,  0.62837606,  0.94939754,
          0.21409242,  0.44220773,  0.00437552,  0.58830595,  0.99841317],
        [ 0.75872233,  0.20767405,  0.90200208,  0.7114609 ,  0.6296845 ,
          0.49036341,  0.79695558,  0.61573146,  0.30559517,  0.89296964,
          0.6746326 ,  0.18496316,  0.76393459,  0.18191098,  0.81846512,
          0.16703585,  0.2260835 ,  0.59724962,  0.02659572,  0.53095834,
          0.30361324,  0.83342973,  0.58129978,  0.3108369 ,  0.36391311,
          0.59308621,  0.65685673,  0.57341446,  0.55275973,  0.62233682],
        [ 0.04996008,  0.05903337,  0.11999273,  0.15578447,  0.02722808,
          0.09773978,  0.9360327 ,  0.35918017,  0.38285422,  0.66278199,
          0.99103608,  0.80063256,  0.97593909,  0.64157364,  0.85458966,
          0.48054315,  0.83518796,  0.50755409,  0.78545608,  0.46815288,
          0.97544199,  0.07602775,  0.19539425,  0.28621797,  0.84659385,
          0.65807282,  0.56768768,  0.18174287,  0.59830874,  0.32325366],
        [ 0.6301899 ,  0.70250514,  0.38641637,  0.65455806,  0.90166753,
          0.30808479,  0.97780122,  0.2781907 ,  0.6110496 ,  0.67570819,
          0.25638336,  0.37984173,  0.63736718,  0.05422958,  0.56391184,
          0.10047142,  0.04170306,  0.95366402,  0.43964007,  0.86767609,
          0.16399534,  0.338445  ,  0.97423617,  0.34907185,  0.02477761,
          0.78385242,  0.26652933,  0.44095795,  0.99095148,  0.161078  ]]),
 array([[  5.90642630e-01,   4.58293028e-01,   6.57607843e-01,
           5.27202202e-01,   5.51025676e-01,   4.71786420e-02,
           9.75460186e-01,   5.07370304e-01,   3.81496453e-01,
           8.19575449e-01,   4.92587224e-01,   4.59547937e-01,
           7.24198659e-02,   6.12107928e-01,   3.03782827e-01,
           1.75741093e-02,   2.25520934e-01,   9.64033526e-01,
           7.33546652e-01,   2.60414594e-01,   6.00809362e-01,
           1.28829474e-01,   2.65799156e-01,   3.02015032e-01,
           5.93816538e-01,   5.63080027e-01,   4.42377138e-01,
           5.69146905e-01,   4.52088043e-01,   6.85915396e-01],
        [  3.38190019e-01,   9.88633136e-02,   4.85365344e-01,
           2.90835328e-01,   8.60389651e-01,   1.69302307e-01,
           8.48125693e-01,   6.64687508e-01,   7.92089141e-01,
           9.32571741e-01,   4.52069646e-01,   7.75234782e-01,
           4.85221093e-01,   8.08991929e-01,   4.37171974e-01,
           8.42891968e-01,   6.37319711e-01,   2.59981718e-01,
           5.20214302e-01,   7.24707637e-01,   2.69366168e-01,
           2.34576127e-01,   2.57103994e-02,   3.54890998e-01,
           8.62024137e-01,   6.94579878e-01,   3.23376811e-01,
           5.91191251e-01,   7.05645936e-01,   9.83679887e-01],
        [  1.06517713e-04,   4.05472646e-01,   7.41305150e-01,
           8.38528473e-01,   5.36945329e-01,   8.79237174e-01,
           4.76667857e-01,   8.22726253e-01,   9.18995666e-01,
           3.99787498e-01,   5.21051680e-01,   1.31282148e-01,
           8.95466711e-01,   3.93664072e-01,   7.81152506e-01,
           4.93095101e-02,   9.11019720e-02,   7.97364194e-01,
           2.47407707e-01,   2.01443327e-01,   2.00571156e-01,
           6.99093131e-01,   6.67747514e-01,   4.35593820e-01,
           6.93677441e-02,   4.73942734e-01,   6.21677977e-01,
           8.46328462e-01,   1.01056635e-01,   9.09405859e-01],
        [  7.72420204e-01,   9.98810350e-01,   9.59423652e-01,
           5.21405848e-01,   6.84360663e-01,   8.50076261e-01,
           5.64303851e-01,   2.88584456e-01,   8.62331510e-01,
           5.76434488e-01,   4.41198087e-01,   1.03716030e-02,
           4.49733008e-01,   3.55235338e-01,   3.90953440e-01,
           4.46829738e-01,   3.23804583e-01,   4.65262792e-01,
           8.57902020e-01,   8.93339444e-01,   9.61079083e-01,
           8.75493948e-01,   1.96422810e-01,   1.08842946e-01,
           4.21685400e-01,   7.88706281e-01,   2.44868838e-02,
           1.08315787e-01,   3.06347258e-01,   5.72416081e-01]])]

In [89]:
# Step 4 - 3) form R^c
R = np.sum(np.sum(J,axis=1),axis=0) # J must stack to shape (S/2, T/S, N): sum the rows of each block, then sum across blocks, leaving one value per strategy

R


Out[89]:
array([ 2.96656617,  5.77712878,  4.72451781,  4.50853567,  4.95010989,
        1.99666117,  1.83395625,  4.10849516,  5.29444656,  3.50566035,
        5.4842235 ,  4.05266567,  2.45600433,  4.0635402 ,  3.57419252,
        4.60507994,  3.26524727,  3.38475985,  4.58737732,  4.90399289,
        3.96640785,  4.42995199,  4.15191805,  4.64718646,  4.90981651,
        4.18330334,  5.17872723,  4.43088064,  5.41329923,  4.06747442])
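
The double sum above only works when every block in `J` has the same shape. A minimal sketch, assuming hypothetical toy blocks (the names `insample_performance`, `blocks` and `R_toy` are illustrative and not part of the notebook), that makes the expected shape explicit:

```python
import numpy as np

def insample_performance(blocks):
    """Sum a list of equally shaped (rows, N) blocks into one value per strategy."""
    stacked = np.asarray(blocks, dtype=float)
    if stacked.ndim != 3:
        raise ValueError("blocks must all share the same (rows, N) shape")
    # Collapse the block axis and the row axis together; only the strategy axis remains.
    return stacked.sum(axis=(0, 1))

# Hypothetical toy data: 2 blocks, 3 rows each, 4 strategies.
blocks = [np.arange(12).reshape(3, 4), np.ones((3, 4))]
R_toy = insample_performance(blocks)
```

Summing over both axes in one call gives the same result as the nested `np.sum` calls in the cell above, while the shape check fails early if the blocks are ragged.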

In [90]:
# Step 4 - 4) form R_bar^c
R_bar = np.sum(np.sum(J_bar,axis=1),axis=0)

R_bar


Out[90]:
array([ 4.11954848,  3.72749858,  4.49402871,  3.91994604,  5.07953486,
        3.39800736,  6.45381339,  4.01510206,  5.14315022,  5.69738423,
        4.5221045 ,  3.65854918,  5.22161935,  3.79993859,  4.65054342,
        2.9302373 ,  2.68242851,  4.83845306,  4.34707872,  4.19009857,
        4.04600671,  3.63214681,  3.09638824,  2.77584556,  4.13157594,
        4.76941279,  3.34520028,  3.31547321,  4.29546377,  5.25649887])

In [91]:
# Step 4 - 5) Determine element n*
n_star = np.where(R==np.max(R)) # indices of every strategy that attains max(R); ties are unlikely but possible

n_star[0][0] # keeps only the first index, so ties in max(R) are broken silently


Out[91]:
1

In [92]:
# Step 4 - 6) Compute relative rank
intermediate = R_bar.argsort()
ranks = intermediate.argsort()

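omega_bar = ranks[n_star[0][0]]/(N+1)  # ranks are 0-based, so omega_bar lies in [0, (N-1)/(N+1)]

omega_bar   # Here the best IS strategy (n* = 1) has 0-based OOS rank 9 with N = 30, giving 9/31.
# A strategy ranked worst OOS would give omega_bar = 0 and a logit of -inf.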


Out[92]:
0.29032258064516131

In [93]:
# Step 4 - 7) Compute logit
logit = np.log(omega_bar/(1-omega_bar))  # assert that omega_bar is in (0,1)

logit


Out[93]:
-0.89381787602209639
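
Steps 5 to 7 can be gathered into one helper. The sketch below is an assumption-laden variant rather than the notebook's own code: it uses 1-based ranks, so that the relative rank stays strictly inside (0, 1), and `np.argmax`, which returns the first index on ties. The names `relative_rank_logit` and the `*_toy` variables are hypothetical.

```python
import numpy as np

def relative_rank_logit(R, R_bar):
    """Logit of the OOS relative rank of the best IS strategy."""
    R = np.asarray(R, dtype=float)
    R_bar = np.asarray(R_bar, dtype=float)
    N = R.shape[0]
    n_star = int(np.argmax(R))                         # best IS strategy, first index on ties
    rank = int(R_bar.argsort().argsort()[n_star]) + 1  # 1-based OOS rank, in 1..N
    omega_bar = rank / (N + 1)                         # strictly inside (0, 1)
    logit = float(np.log(omega_bar / (1 - omega_bar)))
    return n_star, omega_bar, logit

# Hypothetical toy values: strategy 1 is best IS but worst OOS.
n_star_toy, omega_toy, logit_toy = relative_rank_logit([0.1, 0.9, 0.5], [0.3, 0.2, 0.7])
```

With 0-based ranks the toy case would give a relative rank of 0 and a logit of -inf; with 1-based ranks it gives 1/4 and a finite negative logit.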

In [94]:
# Visualization of the ln(x/(1-x)) function:
x = np.arange(0.01,1,0.05)
y = np.log(x/(1-x))

f = plt.figure(figsize=(10,5))
ax = f.add_subplot(111)

ax.plot(x,y)


Out[94]:
[<matplotlib.lines.Line2D at 0x1e9313812b0>]

Combine all of the above to produce the logit distribution


In [95]:
M = np.random.rand(256,40)
S = 16
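
Before running the full loop it helps to know how many train/test splits these settings produce. A small sketch, assuming Python 3.8+ for `math.comb` (the names `n_blocks`, `n_splits`, `first_split` and `complement` are illustrative):

```python
from itertools import combinations
from math import comb

n_blocks = 16                          # same value as S above
n_splits = comb(n_blocks, n_blocks // 2)

# Work with block indices: each split is a tuple of S/2 indices,
# and its complement is simply the remaining indices.
first_split = next(combinations(range(n_blocks), n_blocks // 2))
complement = tuple(i for i in range(n_blocks) if i not in first_split)
```

For 16 blocks this gives 12870 splits, so the loop over combinations does a noticeable amount of work even on a 256 x 40 matrix.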

In [180]:
def produceLogits(M, S):
    T, N = M.shape
    rows = T // S  # rows per submatrix (any remainder rows are dropped)

    all_R_bar                     = []
    performance_degradation_R     = []
    performance_degradation_R_bar = []

    # Step 2: S disjoint submatrices, each of shape (T/S, N)
    subMatrices = [M[rows*i:rows*(i + 1)] for i in range(S)]

    logits = []
    # Step 3: combine block indices rather than arrays, so the complement is unambiguous
    for idx in itertools.combinations(range(S), S // 2):
        J     = np.array([subMatrices[i] for i in idx])
        J_bar = np.array([subMatrices[i] for i in range(S) if i not in idx])

        R     = np.sum(np.sum(J,axis=1),axis=0)      # IS performance per strategy
        R_bar = np.sum(np.sum(J_bar,axis=1),axis=0)  # OOS performance per strategy

        all_R_bar.extend(R_bar)  # required for stochastic dominance

        n_star = int(np.argmax(R))          # best IS strategy (first index on ties)
        ranks  = R_bar.argsort().argsort()  # 0-based OOS ranks

        performance_degradation_R.append(R[n_star])          # required for Performance Degradation
        performance_degradation_R_bar.append(R_bar[n_star])  # required for Performance Degradation

        omega_bar = (ranks[n_star] + 1)/(N + 1)  # 1-based rank keeps omega_bar in (0, 1)

        logits.append(np.log(omega_bar/(1 - omega_bar)))

    return logits, [performance_degradation_R,performance_degradation_R_bar, all_R_bar]

In [181]:
logits, [R, R_bar, all_R_bar] = produceLogits(M, S)

In [182]:
# 1-based ranks keep every logit finite, so no -inf filtering is needed
logits = pd.DataFrame(data=logits,columns=['logits'])

In [183]:
PBO = len(logits.loc[logits.logits < 0])/len(logits.logits)
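
A note on infinite logits: a logit of -inf arises when the selected strategy ranks worst OOS under 0-based ranks, so dropping those values discards exactly the most overfit splits and can bias PBO downwards. The sketch below is one possible alternative, not the notebook's method: it counts -inf as a negative logit instead of removing it. `pbo_from_logits` and `pbo_toy` are hypothetical names.

```python
import numpy as np

def pbo_from_logits(logits):
    """Share of logits below zero; -inf counts as negative, NaN is rejected."""
    logits = np.asarray(logits, dtype=float)
    if logits.size == 0 or np.isnan(logits).any():
        raise ValueError("logits must be non-empty and free of NaN")
    return float(np.mean(logits < 0))

# Hypothetical toy logits, one of them -inf.
pbo_toy = pbo_from_logits([-1.0, 0.5, -np.inf, 2.0, -0.2])
```

On the toy input three of five logits are negative, whereas filtering out the -inf entry first would report two of four.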

In [169]:
f = plt.figure(figsize=(20,10))
ax = f.add_subplot(111)

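p = sns.regplot(x=np.array(R),y=np.array(R_bar),ax=ax)

# Fit on the raw (IS, OOS) pairs. Fitting on the plotted regression line gives the same slope and intercept but a spurious r_value of 1.
slope, intercept, r_value, p_value, std_err = scipy.stats.linregress(x=np.array(R),y=np.array(R_bar))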

ax.set_title('Performance Degradation', fontsize=20)
ax.set_xlabel('SR IS', fontsize=20)
ax.set_ylabel('SR OOS', fontsize=20)

formatter = '{0:.2f}'

ax.text(np.median(np.array(R)), np.max(np.array(R_bar)),r'[SROOS]='+formatter.format(intercept)+'+'+formatter.format(slope)+'*[SRIS]+err',fontsize=25)
ax.text(np.median(np.array(R)), np.min(np.array(R_bar)),r'$P[\overline{R_{n^*}}^c < 0]\equiv$P[SROOS<0]='+formatter.format(len(np.where(np.array(R_bar)<0)[0])/len(np.array(R_bar))),fontsize=25)


Out[169]:
<matplotlib.text.Text at 0x1e926e87358>
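
The two numbers annotated on the plot can also be computed without any plotting. A minimal sketch, using `np.polyfit` as a stand-in for `scipy.stats.linregress` (it returns the same least-squares slope and intercept but no p-value or standard error); `performance_degradation` and the `*_toy` names are hypothetical:

```python
import numpy as np

def performance_degradation(R_is, R_oos):
    """Return (slope, intercept, prob_loss) for paired IS/OOS performances."""
    R_is = np.asarray(R_is, dtype=float)
    R_oos = np.asarray(R_oos, dtype=float)
    slope, intercept = np.polyfit(R_is, R_oos, deg=1)  # least-squares line OOS = a + b * IS
    prob_loss = float(np.mean(R_oos < 0))              # share of OOS results below zero
    return float(slope), float(intercept), prob_loss

# Hypothetical toy pairs: OOS performance falls as IS performance rises.
slope_toy, intercept_toy, prob_loss_toy = performance_degradation(
    [1.0, 2.0, 3.0, 4.0], [1.0, 0.5, 0.0, -0.5]
)
```

A negative slope, as in the toy data, is the pattern the performance degradation plot is meant to expose: the better a strategy looks in-sample, the worse it does out-of-sample.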

In [174]:
f = plt.figure(figsize=(20,10))
ax = f.add_subplot(111)

minLogit = np.min(logits.logits)

sns.distplot(logits.logits, hist_kws=dict(cumulative=True),kde_kws=dict(cumulative=True),ax=ax)
ax.text(minLogit, 0.9, r'[PBO]:$\sum_{n=1}^N P[\overline{r_n} < N/2 | r \in \Omega^{*}_n] P[r \in \Omega^{*}_n]=\int_{-\infty}^{0}f(\lambda) d\lambda$=' + '{:.4f}'.format(PBO),fontsize=20)
ax.text(minLogit,0.8,'[SROOS]:'+r'$\overline{R_{n^*}}^c = \alpha + \beta R^c_{n^*}+\epsilon^c$;$\overline{R_{n^*}}^c$='+formatter.format(intercept)+'+'+formatter.format(slope)+r'*$R^c_{n^*}$+err',fontsize=20)
ax.text(minLogit,0.7,'[Prob. of loss]:'+r'$P[\overline{R_{n^*}}^c < 0]$='+formatter.format(len(np.where(np.array(R_bar)<0)[0])/len(np.array(R_bar))),fontsize=20)

ax.set_title('PBO and Summary',fontsize=25)


Out[174]:
<matplotlib.text.Text at 0x1e938f5d710>

In [199]:
f = plt.figure(figsize=(15,10))
ax = f.add_subplot(111)

best_R = np.sort(np.array(R_bar))
cdf_1 = np.arange(len(best_R))/float(len(best_R))
ax.plot(best_R, cdf_1,label=r'$\bar{R_{n^*}}$')

all_R = np.sort(np.array(all_R_bar))
cdf_2 = np.arange(len(all_R_bar))/float(len(all_R_bar))
ax.plot(all_R, cdf_2,label=r'$\bar{R}$')

ax.set_title('Stochastic Dominance', fontsize=25)

plt.legend(fontsize=25)


Out[199]:
<matplotlib.legend.Legend at 0x1e9419c2b00>
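
The plot compares the two empirical CDFs by eye. A rough numerical check is sketched below, assuming the standard definition of first-order stochastic dominance (the CDF of the dominating sample never lies above the other); `ecdf`, `first_order_dominates` and the toy samples are hypothetical names and data.

```python
import numpy as np

def ecdf(sample, grid):
    """Empirical CDF of `sample` evaluated at each point of `grid`."""
    sample = np.sort(np.asarray(sample, dtype=float))
    return np.searchsorted(sample, grid, side="right") / sample.size

def first_order_dominates(a, b):
    """True if sample `a` empirically first-order dominates sample `b`."""
    grid = np.union1d(a, b)  # every point where either CDF can jump
    return bool(np.all(ecdf(a, grid) <= ecdf(b, grid)))

# Hypothetical toy samples: `a` is `b` shifted up by one.
a_dominates_b = first_order_dominates([2.0, 3.0, 4.0], [1.0, 2.0, 3.0])
b_dominates_a = first_order_dominates([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

Applied to the notebook's data, the first argument would be the OOS performance of the selected strategies and the second the OOS performance of all strategies. Being an all-or-nothing test, it can fail because of a single crossing of the two curves, so it complements the plot rather than replacing it.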
