iPython Cookbook - Monte Carlo II

Generating a Monte Carlo vector using Cholesky Decomposition

Theory

Before we go into the implementation, a bit of theory on Monte Carlo and linear algebra, and in particular the Cholesky decomposition. Firstly Monte Carlo: assume we have a Standard Gaussian vector $Z=(Z_i)$ and we want a general Gaussian vector $X=(X_i)$ with correlation matrix $C = (C_{ij})$. Because everything is linear we know that $X$ will be of the form $X = Z \cdot M$ with some matrix $M$ (we assume the $X$ have zero expectation for simplicity; also we multiply vectors from the left because we prefer row vectors).

Assuming that we have such matrix M what is the corresponding correlation matrix $C^M$? In order to find out we need to calculate $$ E[X_iX_j] = \sum_{\mu\nu} M_{\mu i} M_{\nu j} E[Z_\mu Z_\nu] = \sum_{\mu} M_{\mu i} M_{\mu j} = (M^tM)_{ij} $$ ie we find that the covariance matrix is the square of $M$ (in the matrix sense, ie $M^tM$ where $M^t$ is the transposed matrix, ie with rows and columns interchanged). So if we want to generate a vector $X$ with covariance $C$ what we effectively need is a matrix $M$ such that $C= M^tM$.

Enter Cholesky decomposition: in a nutshell, if a matrix $M$ is symmetric and positive definite (with strictly positive eigenvaluesl; as every non-degenerate covariance matrix is) then there is a unique decomposition $C = L^tL$ with the $L$ being a triangular matrix.

Implementation

Generating a covariance matrix

First we generat a covariance matrix. This is not entirely trivial - the matrix must be symmetric and positive definite - and one way going about is to simply write $C = R^tR$ where $R$ is any random matrix (note that this is not a particularly good covariance matrix, because it is pretty close to the one-systemic-factor model)


In [1]:
import numpy as np

In [2]:
d = 4
R = np.random.uniform(-1,1,(d,d))+np.eye(d)
C = np.dot(R.T, R)
#C, sort(eigvalsh(C))[::-1]

Decomposing the covariance matrix

We are given a covariance matrix $C$ and we want to find a matrix $M$ such that $M^tM=C$. One such matrix is given by running a Cholesky decomposition that is implemented in Python as scipy.linalg.cholesky()


In [3]:
from scipy.linalg import cholesky
M = cholesky(C)
M


Out[3]:
array([[ 1.18580307,  0.35238349,  1.1148574 , -0.43031757],
       [ 0.        ,  0.93486624, -1.55116164,  0.35624587],
       [ 0.        ,  0.        ,  1.22910763,  1.02804289],
       [ 0.        ,  0.        ,  0.        ,  1.17002493]])

Generating $z$

We now generate our Standard Gaussian $z$, as usual one row being one observation ($N$ is the number of rows)


In [4]:
N = 10000
z = np.random.standard_normal((N, d))
z


Out[4]:
array([[-0.7823201 , -0.12360443, -0.96780842,  0.13517212],
       [ 0.53335478, -0.81588521, -0.60080122,  0.67694351],
       [-1.21479747,  0.33155255,  0.42321467,  0.56672363],
       ..., 
       [-1.27686341, -0.93916387, -0.07284885, -0.36030967],
       [ 0.27073261,  0.95586545, -0.4922071 ,  0.20966966],
       [-0.02958598,  0.16729888, -0.97722575, -0.57274249]])

Generating $x$

We now generate our $x$ by multiplying $z \cdot M$


In [5]:
x = np.dot(z,M)
x


Out[5]:
array([[-0.92767758, -0.3912303 , -1.86998562, -0.54418129],
       [ 0.63245373, -0.57479812,  1.12173499, -0.34577631],
       [-1.44051057, -0.11811729, -1.34844116,  1.73902654],
       ..., 
       [-1.51410855, -1.32793819, -0.05626472, -0.28157953],
       [ 0.32103556,  0.98900805, -1.78584907, -0.03666916],
       [-0.03508315,  0.14597646, -1.49360738, -1.60242208]])

Check

We now check that the ex-post covariance matrix $C1$ is reasonably close to the ex-ante matrix $C$


In [6]:
C1 = np.cov(x, rowvar=0, bias=1)
C1, C, sort(eigvalsh(C1))[::-1],sort(eigvalsh(C))[::-1]


Out[6]:
(array([[ 1.35513285,  0.38370626,  1.32505864, -0.47565306],
        [ 0.38370626,  0.95954664, -1.01692056,  0.21841181],
        [ 1.32505864, -1.01692056,  5.12114778,  0.27754752],
        [-0.47565306,  0.21841181,  0.27754752,  2.82800237]]),
 array([[ 1.40612893,  0.41785743,  1.32200133, -0.51027189],
        [ 0.41785743,  0.99814902, -1.05727131,  0.18140543],
        [ 1.32200133, -1.05727131,  5.15971501,  0.23123771],
        [-0.51027189,  0.18140543,  0.23123771,  2.73791486]]),
 array([ 5.70069873,  2.96750667,  1.49782554,  0.0977987 ]),
 array([ 5.74687105,  2.89822579,  1.55892866,  0.09788232]))

Licence and version

(c) Stefan Loesch / oditorium 2014; all rights reserved (license)


In [7]:
import sys
print(sys.version)


3.4.0 (default, Apr 11 2014, 13:05:11) 
[GCC 4.8.2]