In [1]:
# Configure the livereveal (reveal.js) slideshow: theme, transition, starting slide
from IPython.html.services.config import ConfigManager
from IPython.utils.path import locate_profile
cm = ConfigManager(profile_dir=locate_profile(get_ipython().profile))
cm.update('livereveal', {
    'theme': 'sky',
    'transition': 'zoom',
    'start_slideshow_at': 'selected',
})
%load_ext tikzmagic
I would like to talk about three results:
$ \def\vec{\mathop{\mathrm{vec}}\nolimits} \def\trace{\mathop{\mathrm{tr}}\nolimits} \def\rk{\mathop{\mathrm{rank}}\nolimits} \def\Span{\mathop{\mathrm{span}}\nolimits} $ A typical way to define the ``best'' rank-$r$ approximation is to formulate it as a minimization problem
$$R(X) \rightarrow \min,$$ where typically
$$ T_1(X) = \Vert AX + XA^{\top} - y_0 y^{\top}_0 \Vert_F. $$ For $A = A^{\top} > 0$ we can use the energy functional
$$ T_2(X) = \trace X A X + \trace X y_0 y^{\top}_0. $$ When $A$ is a discrete differential operator, these can be ``ill-conditioned'' functionals!
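For concreteness, here is a minimal NumPy sketch of evaluating both functionals for a given candidate $X$ (the matrices A, X and the vector y0 are placeholders, and the sign conventions follow the formulas above):

import numpy as np

def residual_functional(A, X, y0):
    # T_1(X): Frobenius norm of the Lyapunov residual
    return np.linalg.norm(A @ X + X @ A.T - np.outer(y0, y0), 'fro')

def energy_functional(A, X, y0):
    # T_2(X): energy functional, meaningful for A = A^T > 0
    return np.trace(X @ A @ X) + np.trace(X @ np.outer(y0, y0))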
In Kolesnikov and Oseledets (arXiv:1410.3335), motivated by the seminal paper by Y. Saad, we propose to use the following functional:
$$F(U) = \int_{0}^{\infty} \Vert y - \widehat{y} \Vert^2 dt,$$ where $y = e^{At} y_0$, $\widehat{y} = U e^{Bt} U^{\top} y_0$, $B = U^{\top} A U$.
This idea is well-known in model order reduction (although for this particular case we still miss the exact reference). Consider an ODE $$\frac{dy}{dt} = Ay, \quad y(0) = y_0,$$ and look for the optimal subspace $$y(t) \approx \widetilde{y} = U c(t).$$ With the error measure $$F_0(U) = \int^{\infty}_0 \Vert y - \widetilde{y} \Vert^2 dt,$$ the best $U$ is spanned by the $r$ leading eigenvectors of the solution $X$ of $$AX + XA^{\top} = -y_0 y^{\top}_0, \quad F_0(U) = \Vert X - X_r \Vert.$$
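For small dense problems this optimal subspace can be computed directly; a minimal sketch with SciPy (A is assumed stable so that the integral converges; A and y0 are placeholders):

import numpy as np
from scipy.linalg import solve_lyapunov

def optimal_subspace(A, y0, r):
    # solve A X + X A^T = -y0 y0^T, then take the r leading eigenvectors of X
    X = solve_lyapunov(A, -np.outer(y0, y0))
    w, V = np.linalg.eigh((X + X.T) / 2)   # symmetrize against round-off
    return V[:, np.argsort(-w)[:r]]        # orthonormal basis of the best subspace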
Given $U$, the optimal coefficient $c(t) = U^{\top} y(t)$ is not computable, since it requires the unknown $y(t)$.
We replace it with the Galerkin projection:
$$\frac{dc}{dt} = B c, \quad c(0) = c_0, \quad B = U^{\top} A U, \quad c_0 = U^{\top} y_0,$$ and introduce the final functional
$$F(U) = \int_{0}^{\infty} \Vert y - \widehat{y} \Vert^2 dt,$$ where $y = e^{At} y_0$, $\widehat{y} = U e^{Bt} U^{\top} y_0$, $B = U^{\top} A U$.
This functional can be computed explicitly:
$$ F(U) = \trace(X) - 2 \trace(U^{\top}(P - UZ)) - \trace(Z), $$ where $$B Z + Z B^{\top} = c_0 c^{\top}_0, \quad A P + P B^{\top} = y_0 c^{\top}_0, \quad c_0 = U^{\top} y_0.$$
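A hedged SciPy sketch of evaluating $F(U)$ with the formulas above (the term $\trace(X)$ does not depend on $U$ and can be treated as a constant during optimization; A, y0, U are placeholders):

import numpy as np
from scipy.linalg import solve_lyapunov, solve_sylvester

def F(U, A, y0, trace_X=0.0):
    # trace_X = tr(X) is a U-independent constant; pass 0 if only differences matter
    B = U.T @ A @ U
    c0 = U.T @ y0
    Z = solve_lyapunov(B, np.outer(c0, c0))         # B Z + Z B^T = c0 c0^T  (small, r x r)
    P = solve_sylvester(A, B.T, np.outer(y0, c0))   # A P + P B^T = y0 c0^T  (n x r)
    return trace_X - 2.0 * np.trace(U.T @ (P - U @ Z)) - np.trace(Z)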
In principle, we could now optimize over the Stiefel manifold, but we do not know the rank in advance.
Theorem
If $UU^{\top} y_0 = y_0$, then
Based on this theorem, it is a good idea to add a Krylov and a rational Krylov vector at each step.
Assume a rank-$1$ approximation to $P_1$, with $P_1 \approx v q^{\top}$, $q = \frac{z}{\Vert z \Vert}$.
It leads to the following equation for $v$:
$$(A + (q^{\top} B^{\top} q) I) v = w.$$Note the remarkably simple formula for the new shift!
We only need to solve the small Lyapunov equation ($\mathcal{O}(r^3)$ cost) $$BZ + ZB^{\top} = c_0 c^{\top}_0$$ and take the last column of $Z$ (normalized).
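A minimal sketch of this substep (the right-hand side w and the current orthonormal basis U are assumed given, as in the text above):

import numpy as np
from scipy.linalg import solve_lyapunov

def alr_shift_and_direction(A, U, y0, w):
    # small O(r^3) Lyapunov solve: B Z + Z B^T = c0 c0^T
    B = U.T @ A @ U
    c0 = U.T @ y0
    Z = solve_lyapunov(B, np.outer(c0, c0))
    q = Z[:, -1] / np.linalg.norm(Z[:, -1])   # normalized last column of Z
    shift = q @ B.T @ q                       # the new shift, q^T B^T q
    v = np.linalg.solve(A + shift * np.eye(A.shape[0]), w)   # one shifted solve
    return v, shift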
The shifts produced by ALR lie in the numerical range of $B = U^{\top} A U$ and are thus much less spread out than those of RKSM;
we can reuse the AMG hierarchy for the shifted systems.
$$ \begin{array}{lllllr}
\textrm{grid} & \textrm{method} & \textrm{precomp.} & \textrm{time per solve} & \textrm{final time} & \textrm{iterations/rank}\\
\hline
 & ALR & -- & 0.6424 & 3.2881 & 6/13 \\
 & ALR(AMG) & 0.2231 & 0.0466 & \bf{0.5020} & 6/13 \\
 & KPIK(LU) & 0.6687 & \bf{0.0086} & 0.8351 & 9/19 \\
 & KPIK(AMG) & 0.2231 & 0.0436 & 0.6787 & 9/19 \\
20\times 20 \times 20 & KPIK & -- & 0.8533 & 7.0838 & 9/19 \\
 & RKSM & \bf{0.0265} & 0.6610 & 5.5919 & 9/10 \\
 & RKSM(AMG) & 0.2496 & 0.0491 & 0.9061 & 9/10 \\
 & ERKSM & \bf{0.0265} & 0.7303 & 4.8876 & 7/15 \\
 & ERKSM(AMG) & 0.2496 & 0.0541 & 0.9108 & 6/13 \\
\hline
 & ALR & -- & 9.2135 & 53.2061 & 7/15 \\
 & ALR(AMG) & 0.9827 & 0.1987 & \bf{2.3924} & 7/15 \\
 & KPIK(LU) & 10.1743 & \bf{0.0497} & 11.0392 & 9/19 \\
 & KPIK(AMG) & 0.9827 & 0.2055 & 3.0282 & 9/19 \\
30\times 30\times 30 & KPIK & -- & 12.7750 & 105.1641 & 9/19 \\
 & RKSM & \bf{0.1215} & 9.3640 & 85.8903 & 10/11 \\
 & RKSM(AMG) & 1.1042 & 0.1792 & 3.2822 & 10/11 \\
 & ERKSM & \bf{0.1215} & 9.5372 & 58.2637 & 7/15 \\
 & ERKSM(AMG) & 1.1042 & 0.1609 & 3.2703 & 8/17 \\
\end{array} $$
The Dirac-Frenkel variational principle can be used instead:
$$ \langle \frac{dA}{dt} - \frac{dX}{dt}, Z \rangle_F = 0, $$for all $Z$ in the tangent space of the manifold $\mathcal{M}_r$ of all matrices of rank $r$,
and the tangent space at $X = USV^{\top}$ consists of matrices of the form $\delta U\, S V^{\top} + U \delta S\, V^{\top} + U S\, \delta V^{\top}$ (with gauge conditions $U^{\top}\delta U = 0$, $V^{\top}\delta V = 0$). This leads to the equations of motion (Lubich, Koch, Meyer, ...):
$$\begin{split} &\frac{dU}{dt} S = (I - UU^{\top})\frac{dA}{dt} V,\\ &\frac{dV}{dt} S^{\top} = (I - VV^{\top})\frac{dA^{\top}}{dt} U, \\ &\frac{dS}{dt} = U^{\top} \frac{dA}{dt} V. \end{split}$$ Suppose you start from rank $1$ and evolve into a rank-$10$ solution: the matrix $S$ will be singular (or severely ill-conditioned)!
Theorem (O., Lubich, 2014): the equations for the splitting projectors $P_1$, $P_2$, $P_3$ (the "L", "K" and "S" substeps below) can be integrated exactly and efficiently.
Step "L": $$\frac{dX}{dt} = UU^{\top}\frac{dA}{dt} \rightarrow \frac{d(VS^{\top})}{dt} = \frac{dA^{\top}}{dt} U, \frac{dU}{dt}=0,$$
Thus $L_1 = L_0 + (A_1 - A_0)^{\top} U_0, U_1 = U_0$.
Step "K": $$\frac{dX}{dt} = \frac{dA}{dt} VV^{\top} \rightarrow \frac{d(US)}{dt} = \frac{dA}{dt} V, \frac{dV}{dt}=0,$$
Thus $K_1 = K_0 + (A_1 - A_0) V_0, V_1 = V_0$
Step "S": $$\frac{dX}{dt} = -UU^{\top}\frac{dA}{dt} VV^{\top} \rightarrow \frac{dS}{dt} = -U^{\top}\frac{dA}{dt} V, \quad \frac{dV}{dt}=0, \quad \frac{dU}{dt} = 0,$$
Thus $S_1 = S_0 \mathbf{-} U^{\top}_0(A_1 - A_0) V_0, V_1 = V_0, U_1 = U_0$
Here $A_0 = A(t), A_1 = A(t + \tau)$
The final splitting steps are
Step "L": $$\begin{split} L_0 &= V_0 S^{\top}_0, \\ L_1 &= L_0 + (A_1 - A_0)^{\top} U_0, \\ [V_1, S^{\top}_1] &= QR(L_1).\end{split}$$
Step "K": $$\begin{split} K_0 &= U_0 S_0, \\ K_1 &= K_0 + (A_1 - A_0) V_0, \\ [U_1, S_1] &= QR(K_1).\end{split}$$
Step "S": $$S_1 = S_0 - U^{\top}_0 (A_1 - A_0) V_0$$ (note the minus!)
The order of the steps matters.
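A minimal NumPy sketch of one first-order projector-splitting step, chaining the substeps in the K, S, L ordering of the Lubich-Oseledets integrator (each substep reuses the latest factors; the formulas are exactly those listed above):

import numpy as np

def ksl_step(U0, S0, V0, dA):
    # One KSL update of X = U S V^T by the increment dA = A(t + tau) - A(t)
    # K step: K = U S
    U1, S_hat = np.linalg.qr(U0 @ S0 + dA @ V0)
    # S step: note the minus sign
    S_tilde = S_hat - U1.T @ dA @ V0
    # L step: L = V S^T
    V1, S1t = np.linalg.qr(V0 @ S_tilde.T + dA.T @ U1)
    return U1, S1t.T, V1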
The simplest (but naive) way to generalize the low-rank decomposition to many dimensions is to use the sum-of-products definition of the matrix rank:
$$ A(i_1, \ldots, i_d) \approx \sum_{\alpha=1}^r U_1(i_1, \alpha) \ldots U_d(i_d, \alpha). $$ This is called the canonical format, and $r$ is the canonical rank.
The canonical format can be unstable and in general cannot be computed efficiently!
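For concreteness, evaluating a canonical (CP) representation with NumPy for $d = 3$ (the factor matrices U1, U2, U3, each with $r$ columns, are placeholders):

import numpy as np

def cp_full(U1, U2, U3):
    # A(i1, i2, i3) = sum_alpha U1(i1, alpha) U2(i2, alpha) U3(i3, alpha)
    return np.einsum('ia,ja,ka->ijk', U1, U2, U3)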
In 2009 active development of stable tensor formats began.
Two "similar" formats were proposed.
$\def\matX {{\boldsymbol{\mathrm{X}}}} \def\matY {{\boldsymbol{\mathrm{Y}}}} \def\matZ {{\boldsymbol{\mathrm{Z}}}} \def\matC {{\boldsymbol{\mathrm{C}}}} \def\matI {{\boldsymbol{\mathrm{I}}}} \def\matP {{\boldsymbol{\mathrm{P}}}} \def\matQ {{\boldsymbol{\mathrm{Q}}}} $
Notation is hard; we use the one from (Lubich, Vandereycken, O., 2014).
We can introduce the left partial product of cores $\matX_{\le k}$ as
$$ \matX_{\le k} (i_1,\dots,i_k,:) = \matC_1(i_1) \cdots \matC_k(i_k) $$and the right partial product $\matX_{\ge k+1} \in \mathbb{R}^{n_{k+1} \times \cdots \times n_d \times r_k}$ as
$$ \matX_{\ge k+1} (i_{k+1},\dots,i_d,:) = \matC_{k+1}(i_{k+1}) \cdots \matC_d(i_d). $$ The $k$-th unfolding satisfies
$$\mathrm{Ten}_k(X) = \matX_{\leq k} \matX_{\geq k+1}^{\top}$$and there is a recursive connection
$$\matX_{\ge k} = ( \matX_{\ge k+1} \otimes \matI_{n_k} ) \matC_k^{>}.$$ This forms the basis for the orthogonalization:
Given a QR decomposition of $\matX_{\ge k+1}$, it is very simple
to orthogonalize $\matX_{\ge k}$ just by orthogonalizing the next core, sweeping from right to left.
The orthogonal projectors onto the column spaces of the left and right partial products are $ \matP_{\le k} = \matQ_{\le k}\matQ_{\le k}^\top \qquad\hbox{and}\qquad \matP_{\ge k} = \matQ_{\ge k}\matQ_{\ge k}^\top, $ where $\matQ_{\le k}$ and $\matQ_{\ge k}$ are the corresponding orthogonal factors.
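A minimal NumPy sketch of the right-to-left orthogonalization sweep described above (one straightforward way to realize it), with TT cores stored as arrays of shape (r_{k-1}, n_k, r_k):

import numpy as np

def right_orthogonalize(cores):
    # Sweep from the last core to the second one, making each right partial product orthonormal
    cores = [c.copy() for c in cores]
    for k in range(len(cores) - 1, 0, -1):
        r0, n, r1 = cores[k].shape
        # QR of the transposed unfolding gives the orthonormal part of core k
        Q, R = np.linalg.qr(cores[k].reshape(r0, n * r1).T)
        cores[k] = Q.T.reshape(-1, n, r1)
        # absorb the triangular factor into the previous core
        cores[k - 1] = np.einsum('abc,dc->abd', cores[k - 1], R)
    return cores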
We compare the TT-KSL scheme with the MCTDH package for a benchmark problem with the Hénon-Heiles potential \begin{equation} \frac{d\psi}{dt} = i H \psi, \quad \psi(0) = \psi_0, \end{equation} where $H$ has the form \begin{equation} \def\Hlap{-\frac12 \Delta} \def\Hhar{\frac12 \sum_{k=1}^f q^2_k} \def\Hanh{\sum_{k=1}^{f-1}\left(q^2_k q_{k+1} - \frac13 q^3_{k+1} \right)} H = \Hlap + \underbrace{\Hhar + \overbrace{\lambda \Hanh}^{\textrm{anharmonic part}}}_{\textrm{Henon-Heiles potential}~V(q_1,\ldots,q_f)}. \end{equation}
The projector-splitting scheme also has high potential for optimization, since it is formulated directly in terms of discrete steps.
Given an iterative process $x_{k+1} = x_k + F_k$, we create a manifold version
$$x_{k+1} = I(x_k, F_k).$$ What about the convergence?
Typical estimates use the curvature of the manifold; for low-rank matrices the curvature is governed by the condition number,
i.e., it can be arbitrarily large. In practice, it makes no difference.
Model problem: $$x = Qx + f, \quad \Vert Q \Vert \leq \delta. $$
The solution $x^*$ has rank $5$, with $3$ large singular values, one equal to $10^{-3}$ of the largest, and one equal to $10^{-6}$ of the largest.
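A schematic version of such an experiment, using plain SVD truncation as the retraction back to the rank-$r$ manifold (apply_Q and f are placeholders for the operator and the right-hand side; the projector-splitting scheme above could be used as the retraction instead):

import numpy as np

def truncate(X, r):
    # retraction to the rank-r manifold via truncated SVD
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r, :]

def projected_fixed_point(apply_Q, f, r, n_iter=100):
    # manifold version of x_{k+1} = Q x_k + f: every update is retracted to rank r
    X = truncate(f, r)
    for _ in range(n_iter):
        X = truncate(apply_Q(X) + f, r)
    return X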
Figure legend: blue, original gradient method; green, manifold-projected gradient; cyan, the normal component of the error.
Software
In [2]:
# Apply custom CSS styling to the notebook
from IPython.core.display import HTML
def css_styling():
    styles = open("custom.css", "r").read()
    return HTML(styles)
css_styling()
Out[2]:
In [ ]: