# Lecture 11: Matrix functions

## Syllabus

Week 1: Python intro
Week 2: Matrices, vectors, norms, ranks
Week 3: Linear systems, eigenvectors, eigenvalues
Week 4: Singular value decomposition + test + homework seminar
Week 5: Sparse & structured matrices
Week 6: Iterative methods, preconditioners, matrix functions
Week 7: Test week
Week 8: App Period

## Recap of the previous lecture

• Concept of iterative methods
• Richardson iteration, Chebyshev acceleration
• Idea of Krylov methods
• Conjugate gradient methods for symmetric positive definite matrices
• Generalized minimal residual methods
• The Zoo of iterative methods: BiCGStab, QMR, IDR, ...
• The idea of preconditioners
• Jacobi/Gauss-Seidel methods Incomplete ILU

## Today lecture

Today we will talk about matrix functions.

• What is a matrix function
• Matrix exponential
• (Some) applications

Book to read: Nick Higham, Functions of matrices

## Where matrix functions come from

• Solution of PDEs and ODEs (matrix exponential, matrix sine function, matrix cosine function, square root,
sign function)
• Matrix exponential has application in localized PageRank, graph centrality
• Quantum physics
• Inverse is matrix function as well!

## What is a matrix function?

A matrix function is a function of a matrix. A matrix function is not a function, applied to the elements!

I.e., $B = \sqrt{A}$ does not mean that $B_{ij} = \sqrt{A}_{ij}$.



In [3]:

import numpy as np
import scipy.linalg as sla

A = 2 * np.eye(2)
A = np.array([[4, 4], [4, 4]])
sla.funm(A, lambda x: np.sqrt(x) )




Out[3]:

array([[ 1.41421356,  1.41421356],
[ 1.41421356,  1.41421356]])



## The simplest matrix function: matrix polynomial

It is very easy to define a matrix polynomial as
$$P(A) = \sum_{k=0}^n c_k A^k.$$ Side-note: Hamilton-Cayley theorem states that $F(A) = 0$ where $F(\lambda) = \det(A - \lambda I)$.

## Matrix polynomials as building blocks

We can define a function of the matrix by Taylor series:
$$f(A) = \sum_{k=0}^{\infty} c_k A^k.$$ The convergence is understood as the convergence in some matrix norm.
Example of such series is the Neumann series
$$(I - F)^{-1} = \sum_{k=0}^{\infty} F^k,$$ which is well defined for $\rho(F) < 1$.

## Matrix exponential series

The most well-known matrix function is matrix exponential. In the scalar case,
$$e^x = 1 + x + \frac{x^2}{2} + \frac{x^3}{6} + \ldots = \sum_{k=0}^{\infty} \frac{x^k}{k!},$$ and it directly translates to the matrix case:
$$e^A = \sum_{k=0}^{\infty} \frac{A^k}{k!},$$ the series that always converges (why?).

## Why matrix exponential is important

A lot of practical problems are reduced to a system of linear ODEs of the form
$$\frac{dy}{dt} = Ay, \quad y(0) = y_0.$$ The formal solution is given by $y(t) = e^{At} y_0$, so if we know
$e^{At}$ (or can compute matrix-by-vector product fast) there is a big gain over the
time-stepping schemes.

## How to compute matrix functions?

The simplest way is to diagonalize the matrix:
$$A = S \Lambda S^{-1}.$$ where the columns of $S$ are eigenvectors of the matrix $A$, then

$$F(A) = S F(\Lambda) S^{-1}.$$

Problem: diagonalization can be unstable! (and not every matrix is diagonalizable)

Let us look how matrices are diagonalizable:



In [21]:

import numpy as np
eps = 0.5
p = 4
a = np.eye(p)
for i in xrange(p-1):
a[i, i+1] = 1

a[p-1, 2] = eps
a = np.array(a)
val, vec = np.linalg.eig(a)
#print a
print np.linalg.norm(a - vec.dot(np.diag(val)).dot(np.linalg.inv(vec)))
#print 'S * D * S^{-1}:'
#print vec.dot(np.diag(val)).dot(np.linalg.inv(vec))
print a




2.2360679775
[[ 1.   1.   0.   0. ]
[ 0.   1.   1.   0. ]
[ 0.   0.   1.   1. ]
[ 0.   0.   0.5  1. ]]



Now we can compute a function



In [23]:

import numpy as np
eps = 1e-10
p = 5
a = np.eye(p)
for i in xrange(p-1):
a[i, i+1] = 1

a[p-1, 0] = eps
a = np.array(a)
val, vec = np.linalg.eig(a)
print np.linalg.norm(a - vec.dot(np.diag(val)).dot(np.linalg.inv(vec)))

fun = lambda x: np.exp(x)

#Using diagonalization
fun_diag = vec.dot(np.diag(fun(val))).dot(np.linalg.inv(vec))

#Using Schur
import scipy
fun_m = scipy.linalg.expm(a)
np.linalg.norm(fun_m - fun_diag)




4.84290066277e-08

Out[23]:

1.3563815727960955e-07



## How funm function works

The exponential of a matrix is a special function, so there are special methods for its computation. For a general function,
there is a beautiful Schur-Parlett algorithm, which is based on the Schur theorem

## Schur-Parlett algorithm

Given a matrix $A$ we want to compute $F(A)$, and we only can evaluate $F$ at scalar points.
First, we reduce $A$ to the triangular form as
$$A = U T U^*.$$

Therefore, $F(A)=U F(T) U^*$

## Computing functions of triangular matrices

We know values on the diagonals:
and $$F_{ii} = F(T_{ii}),$$ and also we know that $$F T = T F,$$ and we get

$$f_{ij} = t_{ij} \frac{f_{ii} - f_{jj}}{t_{ii} - t_{jj}} + \sum_{k=i+1}^{j-1} \frac{f_{ik} t_{kj} - t_{ki}f_{kj}}{t_{ii} - t_{jj}}.$$

Matrix exponential is well approximated by rational function:

$$\exp(x) \approx \frac{p(x)}{q(x)},$$

where $p(x)$ and $q(x)$ are polynomials and computation of a rational function of a matrix is reduced to matrix-matrix products and matrix inversions.
The rational form is also very useful when only a product of a matrix exponential by vector is needed, since
evaluation reduces to matrix-by-vector products and linear systems solvers

A short demo on Pade approximation via sympy package



In [31]:

import sympy.mpmath
%matplotlib inline
from sympy.mpmath import pade, taylor, polyval
import matplotlib.pyplot as plt
x = np.linspace(0, 1, 128)
a = taylor(sympy.mpmath.exp, 0, 100)
p, q = pade(a, 4, 2)
plt.plot(x, polyval(p[::-1], x)/polyval(q[::-1], x) - np.exp(x))




Out[31]:

[<matplotlib.lines.Line2D at 0x111e40390>]



## Application to graph centrality

One of the recent application is the computation. This example was taken from here. Take the following network.



In [32]:

import networkx as nx
import numpy as np
%matplotlib inline
Adj = np.array([[0, 1, 1, 0, 0, 0],
[1, 0, 1, 1, 1, 0],
[1, 1, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 1],
[0, 1, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 0]])

nx.draw(G, node_color='y', node_size=1000, with_labels=True )






One measure of the importance of each node is its centrality. We can count the number of paths of different lengths from $i$ to $i$,
and add them up to get centrality of the node $$c(i) = \alpha_1 A_{ii} + \alpha_2 A^2_{ii} + \alpha_3 A^3_{ii} + \cdots,$$ where the coefficients $\alpha_k$ remain to be chosen.
With $\alpha_k = \frac{1}{k!}$ we get
$$c = \mathrm{diag}(\exp(A))$$



In [33]:

nodeorder = np.argsort(centralities)[::-1]

print np.array([nodeorder, centralities[nodeorder]])

# Note: This is already built into networkx using the following command
# print nx.communicability_centrality_exp(G)




[[ 1.          0.          2.          3.          4.          5.        ]
[ 4.44723536  2.86427609  2.86427609  2.36018456  1.71615913  1.59432922]]



## Some other matrix functions

• Sign function: $F(A) = \mathrm{sign}(A)$ (no Taylor series), used for spectrum partititioning problems.
Iteration to compute sign-function: $$A_{k+1} = \frac{1}{2} (A_k + A^{-1}_k), A_0 = A$$
• Cosine/sine functions, used to solve $$\frac{d^2 y}{dt^2} = A y$$
• Square root $\sqrt{A}$, used to solve Lyapunov equations
$$A X + X A^{\top} = B,$$ and Lyapunov equation is used in control theory, model order reduction and many other things.

## Take home message

• Matrix exponential
• Schur-Parlett
• Graphs & PDEs

## Next lecture

We are done!
Friday - homework seminar

##### Questions?


In [29]:

from IPython.core.display import HTML
def css_styling():
return HTML(styles)
css_styling()




Out[29]:

@font-face {
font-family: "Computer Modern";
src: url('http://mirrors.ctan.org/fonts/cm-unicode/fonts/otf/cmunss.otf');
}
div.cell{
/*width:80%;*/
/*margin-left:auto !important;
margin-right:auto;*/
}
h1 {
font-family: 'Alegreya Sans', sans-serif;
}
h2 {
font-family: 'Fenix', serif;
}
h3{
font-family: 'Fenix', serif;
margin-top:12px;
margin-bottom: 3px;
}
h4{
font-family: 'Fenix', serif;
}
h5 {
font-family: 'Alegreya Sans', sans-serif;
}
div.text_cell_render{
font-family: 'Alegreya Sans',Computer Modern, "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif;
line-height: 1.2;
font-size: 160%;
/*width:70%;*/
/*margin-left:auto;*/
margin-right:auto;
}
.CodeMirror{
font-family: "Source Code Pro";
font-size: 90%;
}
/*    .prompt{
display: None;
}*/
.text_cell_render h1 {
font-weight: 200;
font-size: 50pt;
line-height: 110%;
color:#CD2305;
margin-bottom: 0.5em;
margin-top: 0.5em;
display: block;
}
.text_cell_render h5 {
font-weight: 300;
font-size: 16pt;
color: #CD2305;
font-style: italic;
margin-bottom: .5em;
margin-top: 0.5em;
display: block;
}

li {
line-height: 110%;
}
.warning{
color: rgb( 240, 20, 20 )
}

MathJax.Hub.Config({
TeX: {
extensions: ["AMSmath.js"]
},
tex2jax: {
inlineMath: [ ['$','$'], ["\$","\$"] ],
displayMath: [ ['$$','$$'], ["\$","\$"] ]
},
displayAlign: 'center', // Change this to 'center' to center equations.
"HTML-CSS": {
styles: {'.MathJax_Display': {"margin": 4}}
}
});




In [ ]: