Before we go into the implementation, a bit of theory on Monte Carlo and linear algebra, in particular the eigenvalue / eigenvector decomposition. Assume we have a Standard Gaussian vector $Z=(Z_i)$ and we want to construct a general Gaussian vector $X=(X_i)$ with covariance matrix $C = (C_{ij})$.
We know that there are vectors $e^i=(e^i_j)$ such that $e^i \cdot C = \lambda_i e^i$, the so-called eigenvectors, with the $\lambda_i$ being the eigenvalues (we use row vectors, hence we multiply from the left). We know that those eigenvectors are orthonormal (strictly speaking they are only orthogonal, but we can choose them to be of unit length), i.e. $e^i \cdot e^j = \delta_{ij}$ where $\delta$ is the well-known Kronecker delta.
We now take our Standard Gaussian vector $Z=(Z_i)$ and we define the vector $X=(X_i)$ through $$ X = \sum_\mu \sqrt{\lambda_\mu} e^\mu Z_\mu $$ We can compute the covariance of this $X$, which we call $\bar{C}$ for the time being, using the fact that the components of a Standard Gaussian vector are independent with unit variance, i.e. $E[Z_\mu Z_\nu] = \delta_{\mu\nu}$: $$ \bar{C}_{ij} = E[X_i X_j] = \sum_{\mu\nu} \sqrt{\lambda_\mu \lambda_\nu} e^\mu_i e^\nu_j E[Z_\mu Z_\nu]=\sum_{\mu}\lambda_\mu e^\mu_i e^\mu_j $$ We now multiply by the vector $e^i$ from the left $$ (e^i \bar{C})_j = \sum_\nu e^i_\nu \bar{C}_{\nu j} = \sum_{\nu\mu}\lambda_\mu e^i_\nu e^\mu_\nu e^\mu_j = \sum_\mu \lambda_\mu \delta_{i\mu} e^\mu_j = \lambda_i e^i_j $$ and we find that the matrix $\bar{C}$ satisfies the above eigenvector equation $e^i \cdot \bar{C} = \lambda_i e^i$ for all $e^i$. Because the $e^i$ form a basis, $v \cdot \bar{C} = v \cdot C$ holds for every vector $v$, and therefore $\bar{C}=C$.
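Before moving on, here is a quick numerical sanity check of the identity $\bar{C}_{ij} = \sum_\mu \lambda_\mu e^\mu_i e^\mu_j$ above. This is a minimal self-contained sketch (not part of the original exposition), reusing the fixed example matrix that appears commented out in the code further below.

import numpy as np
from scipy.linalg import eigh

C = np.array(((5., 2, 3), (2, 5, 4), (3, 4, 5)))    # example symmetric positive definite matrix
lam, evm = eigh(C)                                  # eigenvalues and (column) eigenvectors
evm = evm.T                                         # rows are now the eigenvectors e^mu
C_rec = sum(l * np.outer(e, e) for l, e in zip(lam, evm))
print(np.allclose(C_rec, C))                        # True: C is reconstructed from its eigenpairs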
First we generate a covariance matrix. This is not entirely trivial, since the matrix must be symmetric and positive definite, and one way of going about it is to simply write $C = R^T R$ where $R$ is any random matrix (note that this does not produce a particularly good covariance matrix, because it is pretty close to the one-systemic-factor model).
In [1]:
import numpy as np
In [2]:
d = 3
R = np.random.uniform(-1,1,(d,d)) + np.eye(d)   # random matrix, shifted towards the identity
C = np.dot(R.T, R)                              # symmetric and positive definite by construction
#C = np.array(((5,2,3),(2,5,4),(3,4,5)))        # alternative: a fixed example matrix
C
Out[2]:
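As a quick hedged sanity check (a sketch, reusing C and np from above), we can confirm that the generated matrix is indeed symmetric with strictly positive eigenvalues:

from scipy.linalg import eigvalsh

assert np.allclose(C, C.T)          # symmetric by construction
assert (eigvalsh(C) > 0).all()      # strictly positive eigenvalues, i.e. positive definite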
We are given a covariance matrix $C$ and we want to find its eigenvalues and eigenvectors. In Python the function that does this is scipy.linalg.eigh(). It returns a tuple: the first component is a vector containing the eigenvalues, and the second is a matrix whose columns correspond to the eigenvectors (which we transpose, i.e. in evm the eigenvectors are in rows!).
In [3]:
from scipy.linalg import eigh
lam, evm = eigh(C)   # eigenvalues and eigenvectors of the symmetric matrix C
evm = evm.T          # transpose so that the eigenvectors are in rows
lam, evm
#np.dot(evm[0],C), lam[0] * evm[0]   # check the eigenvector equation for e^0
Out[3]:
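The commented-out line above checks the eigenvector equation for the single eigenvector $e^0$; the following sketch (an addition, reusing lam, evm, C and d from above) verifies the orthonormality relation $e^i \cdot e^j = \delta_{ij}$ and the eigenvector equation for all eigenpairs at once:

assert np.allclose(np.dot(evm, evm.T), np.eye(d))       # rows of evm are orthonormal
assert np.allclose(np.dot(evm, C), lam[:, None] * evm)  # e^i . C = lambda_i e^i for every i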
In [4]:
N = 10000
z = np.random.standard_normal((N, d))   # N independent draws of a d-dimensional Standard Gaussian vector
z
Out[4]:
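Before correlating the draws it can be instructive to confirm that they are approximately standard: the sample covariance of z should be close to the identity matrix, up to Monte Carlo noise of order $1/\sqrt{N}$ (a quick sketch, reusing z from above):

np.round(np.cov(z, rowvar=False), 2)   # should be close to the identity matrix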
We have a matrix of row-vectors $z$. Each row corresponds to one draw of all random variables, and each column corresponds to all draws of one random variable. We also have the matrix of eigenvectors, where every row corresponds to one eigenvector. Finally we construct a matrix that has the square roots of the eigenvalues on its diagonal $$ lm = \begin{pmatrix} \sqrt{\lambda_0} \\ & \sqrt{\lambda_1} \\ & & \ddots \\ & & & \sqrt{\lambda_{d-1}} \end{pmatrix} $$
We then compute $x$ from the $z$ as $$ x = z \cdot lm \cdot evm $$
In [5]:
lm = np.diag(np.sqrt(lam))   # diagonal matrix of the square roots of the eigenvalues
lm
Out[5]:
In [6]:
x = np.dot(z, lm)    # scale factor mu by sqrt(lambda_mu)
x = np.dot(x, evm)   # mix the scaled factors along the eigenvectors
x
Out[6]:
In [7]:
from scipy.linalg import eigvalsh
C1 = np.cov(x, rowvar=0, bias=1)    # sample covariance of the generated draws
C1, C, np.sort(eigvalsh(C1))[::-1], np.sort(eigvalsh(C))[::-1]
Out[7]:
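The agreement between C1 and C can also be quantified directly; with $N = 10000$ draws the entries should agree to roughly $1/\sqrt{N} \approx 0.01$ times the scale of $C$ (a hedged sketch, reusing the variables above):

np.abs(C1 - C).max()   # largest entry-wise deviation between sample and target covariance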
(c) Stefan Loesch / oditorium 2014; all rights reserved (license)
In [8]:
import sys
print(sys.version)