In many applications, the linear transformation is described by a shift-invariant integral operator, also called convolution: $$ f(x) = \int_{\Pi} K(x -y) u(y) dy, $$ where $\Pi$ is a square in $\mathbb{R}^d$.
Example applications:
After the discretization on a uniform grid, we get discrete convolution:
$$
y_i = \sum_{j} t_{i - j} x_j,
$$
and the matrix $T_{ij} = t_{i - j}, i, j = 1, \ldots, n$ is called a Toeplitz matrix: on each diagonal we have the same value.
For $4 \times 4$ case it looks as follows: $$ T = \begin{pmatrix} t_0 & t_{-1} & t_{-2} & t_{-3} \\ t_1 & t_0 & t_{-1} & t_{-2} \\ t_2 & t_1 & t_0 & t_{-1} \\ t_3 & t_2 & t_1 & t_0 \end{pmatrix}, $$ and the matrix is completely defined by 2n-1 parameters (first column, first row minus the first element).
Now, let us do a short demo on how the Toeplitz matrices can be used.
Consider a one-dimensional signal
In [1]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
n = 128
u = np.zeros(n)
u[9] = 0.25
u[10] = 0.5
u[11] = 0.25
u[70] = 1
plt.plot(u)
Out[1]:
Do some blurring of the signal
In [7]:
c = 1
T = [[np.exp(-0.1 * (i - j) ** 2) for i in xrange(n)] for j in xrange(n)]
T = c * np.array(T)
y = T.dot(u)
p = np.arange(n)
plt.plot(p, y, label='blurred')
plt.plot(p, u, label='original')
plt.legend(loc='best')
Out[7]:
Now add some noise
In [9]:
e = np.random.randn(n)
w = y + 1e-3 * e
plt.plot(w)
plt.plot(y)
Out[9]:
Now let us do the "deblurring" (we assume, that we know the matrix T).
In [10]:
urec = np.linalg.solve(T, w)
plt.plot(urec)
Out[10]:
It fails completely! What to do? Use SVD approximation to the matrix T.
In [16]:
u1, s1, v1 = np.linalg.svd(T)
r = 70
u1 = u1[:, :r]
s1 = s1[:r]
v1 = v1[:r, :]
urec_svd = v1.T.dot(u1.T.dot(w) / s1)
plt.plot(urec_svd)
plt.plot(u)
Out[16]:
There is a well-known result that convolution is a product in the Fourier domain:
$$f = g * h, \quad \mbox{or } f(x) = \int g(x) h(x - y) dy,$$
$$\widehat{f} = \widehat{g} \widehat{h},$$
where
$$
\widehat{f}(k) = \int e^{i k x} f(x) dx.
$$
is the Fourier transform of a function.
Thus, the algorithm can look as follows:
But we have to work on the discrete level, so we might replace the continious Fourier transform by the discrete Fourier transform (DFT).
But a Toeplitz matrix is defined by $(2n-1)$ parameters!
To apply Fast Fourier Transform (FFT) we have to consider a subclass of Toeplitz matrices:
circulant matrices.
A circulant matrix is a Toeplitz matrix with
$$
t_{n - i} = t_i.
$$
Example:
$$
C = \begin{pmatrix}
c_0 & c_3 & c_2 & c_1 \\
c_1 & c_0 & c_3 & c_2 \\
c_2 & c_1 & c_0 & c_3 \\
c_3 & c_2 & c_1 & c_0
\end{pmatrix}.
$$
A circulant matrix is uniquely defined by its first column and has $n$ independent parameters.
Indeed, $$ y = Cx = F^* \Lambda F x = F^* \Big( \mathrm{diag}(Fc) \Big) Fx, $$ so you see a direct analogue with the continious case:
A quick demo on the spectral theorem
In [34]:
import numpy as np
import scipy.linalg
n = 50
b = np.random.randn(n)
mm = scipy.linalg.circulant(b)
mm1 = np.fft.ifft(mm, axis = 0)
mm1 = np.fft.fft(mm1, axis = 1)
#mm1 = np.fft.fft2(mm)
print np.linalg.norm(mm1 - np.diag(np.diag(mm1)))
w1 = np.linalg.eigvals(mm)
w2 = np.fft.ifft(b) * n
plt.plot(w1.real, w1.imag, ls='', marker='x')
plt.plot(w2.real, w2.imag, ls='', marker='.')
Out[34]:
or $$ C = \begin{bmatrix} T & * \
* & *
\end{bmatrix},
$$
i.e. $T$ is a submatrix of a circulant matrix, and
$$
C \begin{bmatrix}
q \\
0
\end{bmatrix} = \begin{bmatrix}
T q \\
*
\end{bmatrix}
$$
In [37]:
n = 512
a = np.random.randn(2*n)
b = np.random.randn(2*n-1)
%timeit np.fft.fft(a)
%timeit np.fft.fft(b)
The situation is the same: embed into two-level circulant matrix (of size $[(2n-1) \times (2n-1)]^2$), do two-dimensional FFT, multiply and do inverse two-dimensional FFT.
A two-dimensional FFT is simple: it it just Kronecker product of two one-dimensional FFT. $$ F_2 = F \otimes F. $$ In practice, it means "apply FFT to columns, and apply FFT to rows".
Inverse of Toeplitz matrix is not a Toeplitz matrix!
In [41]:
import numpy as np
import scipy
from scipy.linalg import toeplitz
n = 6
c = np.arange(n)
t = scipy.linalg.toeplitz(c**3)
print t
ii = np.linalg.inv(t)
print 'Inverse:'
print ii
Introduce a shift matrix $$ Z_f = \begin{bmatrix} 0 & 0 & 0 & f \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ \end{bmatrix}. $$
For any Toeplitz matrix $T$ it holds that the matrix $$ M = L(T) = Z_f T - T Z_e $$ has rank 2. This is the definition of displacement rank: for a fixed pair $Z_f$, $Z_e$, $L(T)$ has rank at most $r$.
therefore
$$
T^{-1} Z_f - Z_e T^{-1} = T^{-1} U V^{\top} T^{-1},
$$
and the right-hand side has rank at most $r$. Moreover, if the right-hand side is given,
it is easy to recover the elements of the matrix, or even compute matrix-by-vector products.
Further reading is a paper by Victor Pan
This is not an inversion algorithm, it is only the structure of the inverse.
Gohberg-Semencul algorithm was the first work on the subject and presented an $\mathcal{O}(n^2)$ algorithm.
Superfast algorithms ($\mathcal{O}(n \log n)$) are typically quite complicated, not stable and require the recursion.
Displacement structure of the Cauchy-like matrix is simple:
$$
x_i a_{ij} - a_{ij} y_j = b_{ij} (x_i - y_j),
$$
i.e. $L(A) = D_1 A - A D_2$.
We can transform generators from one case to another. If we diagonalize $Z_e, Z_f$, we get the diagonal matrices,
i.e. by FFT we can reduce the Toeplitz matrix to a Cauchy-like matrix.
Good news about the Cauchy-like matrix is that there is an explicit inverse formula,
$$
a_{ij} = \frac{b_{ij}}{(x_i - y_j)},
$$
which also allows for pivoting
(permuting rows and columns of a Cauchy-like matrix leaves it a Cauchy-like matrix).
In [42]:
import numpy as np
import scipy.sparse
n = 100
a = [[1.0 / (i - j + 0.5) for i in xrange(n)] for j in xrange(n)]
a = np.array(a)
#ex = np.ones(n);
#a = scipy.sparse.spdiags(np.vstack((ex, -2*ex, ex)), [-1, 0, 1], n, n, 'csr');
#a = a.todense()
mm1 = np.fft.ifft(a, axis=0)
mm1 = np.fft.fft(mm1, axis=1)
In [43]:
%matplotlib inline
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import axes3d
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
#fig, ax = ppl.subplots(111, projection='3d')
t = np.arange(n)
x, y = np.meshgrid(t, t)
ax.plot_surface(x, y, mm1.real, rstride=10, cstride=10)
ax.autoscale_view(tight=True)
ax.view_init(elev=10., azim=40)
fig.tight_layout()
#X, Y, Z = axes3d.get_test_data(0.05)
#ax.plot_wireframe(X, Y, Z, rstride=10, cstride=10)
The singularity is along the diagonal!
Off the diagonal we have nice and low-rank blocks!
In [44]:
b11 = mm1[n/2:, :][:, :n/2]
s = np.linalg.svd(b11)[1]
plt.semilogy(s)
Out[44]:
In fact there is more: every block outside the main diagonal has low rank. This class is called quasiseparable matrices. There exist exact $\mathcal{O}(n)$ algorithms for their inversion (and the inverse of a quasiseparable matrix). Thus, the algorithm:
For multilevel Toeplitz matrices there is not a lot known about the inversion.
The best known algorithms have complexity $\mathcal{O}(N^{3/2})$. There is no explicit representation of the inverse.
However, there are a lot of possibilities for the iterative algorithms, since matrix-by-vector product is fast.
In [ ]:
from IPython.core.display import HTML
def css_styling():
styles = open("./styles/custom.css", "r").read()
return HTML(styles)
css_styling()
Out[ ]: