
Tensor train(s) for quantum chemistry

Ivan Oseledets, Skolkovo Institute of Science and Technology
oseledets.github.io, i.oseledets@skoltech.ru

Plan of the talk

I will tell the story of what we have tried to do in quantum chemistry over the last 5+ years.

Why we started doing numerical tensor methods

Because we worked with matrices and their low-rank approximations, and one day we asked a question:

What happens, if we take $a_{ijk}$ instead of $a_{ij}$?

Which quantum chemistry problems can be considered by numerical tensor methods

  • Stationary or non-stationary Schrödinger equation
$$ H = -\frac{1}{2} \Delta + V, \qquad H \psi_i = E_i \psi_i.$$

(Several low-lying eigenstates, vibrational problems)

  • $$\frac{d\psi}{dt} = i H \psi, \quad \psi(0) = \psi_0, \quad \psi = \psi(x, t), \quad x \in \mathbb{R}^d.$$
  • Approximation of PES itself by sampling
  • Even single-point PES evaluation (electronic structure) can be done by tensors!

Main tool

The main tool is the tensor-train (TT) format, which has the form (similar to the CP-format)

$$A(i_1, \ldots, i_d) = G_1(i_1) G_2(i_2) \ldots G_d(i_d), $$

where $G_k(i_k)$ is an $r_{k-1} \times r_k$ matrix and $r_0 = r_d = 1$.

Known for a long time as matrix product states (MPS), used to represent spin wavefunctions.

Also known as a linear tensor network.
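To make the format concrete, here is a minimal numpy sketch (illustrative, not the TT-Toolbox implementation) of how a single entry is evaluated: multiply the $d$ small matrices $G_k(i_k)$ together, at a cost of $\mathcal{O}(d r^2)$ per entry.

```python
import numpy as np

def tt_entry(cores, idx):
    """Evaluate A(i_1, ..., i_d) for a tensor stored as TT cores.

    cores[k] has shape (r_{k-1}, n_k, r_k); cores[k][:, i_k, :] is the
    matrix G_k(i_k), so the entry is the product G_1(i_1) ... G_d(i_d).
    """
    v = cores[0][:, idx[0], :]            # 1 x r_1 row
    for core, i in zip(cores[1:], idx[1:]):
        v = v @ core[:, i, :]             # r_{k-1} x r_k matrix product
    return v[0, 0]                        # the full product is 1 x 1
```

Storing the cores takes $\mathcal{O}(d n r^2)$ memory instead of $n^d$ for the full tensor.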

Good things about TT-format

  • Can be computed by sequential singular value decompositions (SVDs)
  • Always works when a sum-of-products representation works ($r_k \leq r$)
  • Convergence of alternating least squares is much better than for the CP-format
  • (Typically) simpler to implement than H-Tucker or tensor networks with cycles

TT and SVD

The ranks $r_k$ are the ranks of the unfoldings

$$A(i_1 \ldots i_k; i_{k+1} \ldots i_d).$$

The algorithm is constructive (and very simple, takes 20-30 lines of code).
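A plain-numpy sketch of that constructive TT-SVD algorithm (assuming the full tensor fits in memory; TT-Toolbox contains the production version):

```python
import numpy as np

def tt_svd(a, eps=1e-10):
    """Compress a full tensor into TT cores by sequential truncated SVDs."""
    a = np.asarray(a, dtype=float)
    d, shape = a.ndim, a.shape
    # per-SVD truncation threshold giving total relative error about eps
    delta = eps * np.linalg.norm(a) / max(np.sqrt(d - 1), 1.0)
    cores, r_prev, c = [], 1, a
    for k in range(d - 1):
        c = c.reshape(r_prev * shape[k], -1)
        u, s, vt = np.linalg.svd(c, full_matrices=False)
        # keep the smallest rank whose discarded tail norm is below delta
        tails = np.sqrt(np.cumsum(s[::-1] ** 2))[::-1]
        r = max(1, int(np.sum(tails > delta)))
        cores.append(u[:, :r].reshape(r_prev, shape[k], r))
        c = s[:r, None] * vt[:r]          # carry the remainder to the right
        r_prev = r
    cores.append(c.reshape(r_prev, shape[-1], 1))
    return cores

def tt_full(cores):
    """Contract TT cores back into the full tensor (for checking)."""
    res = cores[0]
    for g in cores[1:]:
        res = np.tensordot(res, g, axes=(-1, 0))
    return res.reshape(res.shape[1:-1])
```

Each truncated SVD contributes at most $\delta$ to the error, which yields the overall $\varepsilon \Vert A \Vert_F$ bound.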

Basic operations

  • We can do basic linear algebra operations in $\mathcal{O}(dnr^{\alpha})$
  • We can do rank reduction in $\mathcal{O}(dnr^3)$ operations
  • We can recover the tensor exactly by sampling
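As an example of the first bullet, addition of two TT tensors (a numpy sketch, assuming cores stored as arrays of shape $(r_{k-1}, n_k, r_k)$; not TT-Toolbox code) just stacks the cores block-diagonally, so the ranks add, and that is exactly why the rank-reduction (rounding) step matters:

```python
import numpy as np

def tt_add(a_cores, b_cores):
    """Add two tensors in TT form: cores are embedded block-diagonally,
    so the ranks of the sum are r_k(A) + r_k(B)."""
    d, out = len(a_cores), []
    for k, (ga, gb) in enumerate(zip(a_cores, b_cores)):
        ra0, n, ra1 = ga.shape
        rb0, _, rb1 = gb.shape
        if k == 0:                # boundary rank 1: concatenate columns
            g = np.concatenate([ga, gb], axis=2)
        elif k == d - 1:          # boundary rank 1: concatenate rows
            g = np.concatenate([ga, gb], axis=0)
        else:                     # interior core: block-diagonal embedding
            g = np.zeros((ra0 + rb0, n, ra1 + rb1))
            g[:ra0, :, :ra1] = ga
            g[ra0:, :, ra1:] = gb
        out.append(g)
    return out
```

After a chain of such operations the ranks are overestimated, and a rounding call (SVD-based, $\mathcal{O}(dnr^3)$) brings them back down.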

Now we go to the story, and start from electronic structure computations.

Tensor-based electronic structure computations

Joint work with Maxim Rakhuba http://arxiv.org/pdf/1508.07632.pdf

We consider Hartree-Fock and DFT equations.

$$H(\Psi) \Psi = \Psi \Lambda.$$

Our approach:

  • Fully $n \times n \times n$ grid-based -- controlled basis-set error.
  • Real space orbitals are in the Tucker format $\Longrightarrow$ linear in 1D grid size complexity $\mathcal{O}(n^3) \rightarrow \mathcal{O}(n)$
  • Fast 3D convolutions via cross approximation

Some details for our grid-based electronic structure solver

  • Preconditioned direct minimization procedure -- good for tensor formats
  • Derivative free calculations of the Fock matrix
  • DIIS procedure
  • Aitken's extrapolation

Accuracy and extrapolation

Experiments: Accuracy for He

For helium, the Hartree-Fock limit is $E = -2.861\ 679\ 996$. (Table of computed energies omitted from this export.)

Experiments: Molecules

Experiments: Clusters

For clusters we observed that ranks grow sublinearly with the system size:

  • H$_{3\times 2 \times 2}$: highest ranks $= 35\times35\times 20$
  • H$_{9\times 2 \times 2}$: highest ranks $= 36\times36\times 24$

Experiments: behavior of the factors (analogous to basis functions in MCTDH)

Sampling low-rank functions

Given a function $f(x_1, \ldots, x_d)$ that can be represented as a sum of products, what can we say about its recovery from samples?

Warning

We have heard about the canonical polyadic (CP) format:

$$A(i_1, \ldots, i_d) = \sum_{\alpha=1}^r A_1(i_1, \alpha) \ldots A_d(i_d, \alpha).$$

The CP-format can be hard to fit: although for a given tensor it might be easy, in general even determining the rank is an NP-hard problem!

Thus, robust sampling in the CP-format can be tricky.

In the TT-format the situation is absolutely different!

Sampling in the TT-format

Theorem (O., Tyrtyshnikov, 2010): A rank-$r$ tensor can be exactly recovered from $\mathcal{O}(dnr^2)$ elements.

The algorithm named TT-cross was also proposed.

The name "cross" comes from the matrix case

In general, a rank-$r$ matrix can be exactly recovered from $r$ columns and $r$ rows (it is Gaussian elimination in disguise).
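A toy numpy illustration of this skeleton (cross) decomposition; the row and column indices are picked by hand here and assumed to give a nonsingular intersection submatrix:

```python
import numpy as np

def skeleton(a, rows, cols):
    """Rebuild a rank-r matrix from r of its rows and r of its columns:
    A ~ C @ inv(A_hat) @ R, where A_hat is the r x r intersection."""
    c = a[:, cols]                      # chosen columns
    r = a[rows, :]                      # chosen rows
    a_hat = a[np.ix_(rows, cols)]       # r x r intersection submatrix
    return c @ np.linalg.solve(a_hat, r)
```

For an exactly rank-$r$ matrix the reconstruction is exact; for an approximately low-rank matrix the quality depends on which rows and columns are chosen.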

How do we find "good columns"?

The guiding principle for finding good columns is the maximum-volume principle:

If the $r \times r$ submatrix $\widehat{A}$ has the largest possible absolute value of the determinant,

then (Goreinov, Tyrtyshnikov)

$$\Vert A - A_{skel} \Vert_C \leq (r+1)^2 \Vert A - A_{best} \Vert_C.$$

It is related to the (quasi)-optimal interpolation points problem for the given basis set.
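A minimal greedy maxvol sketch in numpy (illustrative only; practical implementations use rank-one updates and a more careful choice of the starting submatrix):

```python
import numpy as np

def maxvol(a, tol=1.05, max_iters=100):
    """Greedy search for r rows of a tall n x r matrix whose r x r
    submatrix has (locally) maximal absolute determinant.

    Classic swap iteration: with B = A @ inv(A[rows]), swapping row
    rows[j] for row i multiplies the volume by |B[i, j]|, so we keep
    swapping while some entry of |B| exceeds 1 (up to tol)."""
    n, r = a.shape
    rows = list(range(r))               # start from the first r rows
    for _ in range(max_iters):
        b = a @ np.linalg.inv(a[rows])  # b[rows] is the identity
        i, j = np.unravel_index(np.abs(b).argmax(), b.shape)
        if abs(b[i, j]) < tol:          # no swap improves the volume much
            break
        rows[j] = i
    return sorted(rows)
```

At a maxvol submatrix, all entries of $A \widehat{A}^{-1}$ are bounded by 1 in absolute value, which is what drives the $(r+1)^2$ quasi-optimality bound above.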

PES sampling

  • We have a cross-approximation code in our software package TT-Toolbox (both MATLAB and Python versions available)

  • Coupling to electronic structure software is technically difficult

V. Baranov, I. Oseledets, Fitting high-dimensional potential energy surface using active subspace and tensor train (AS+TT) method, J. Chem. Phys., 2015

The idea was to combine it with learning a linear transformation of the Cartesian coordinates.

The directions were chosen as the principal directions of randomly sampled gradients.

TT+AS

Here are the results for the water molecule (the AS approach catches only the translational degrees of freedom).

Ranks


Accuracy


How to compute eigenvalues?

$$ H \psi = E \psi, $$

$\psi \in \mathcal{M}$, where $\mathcal{M}$ is the manifold of tensors with fixed TT-ranks.

  • Method 1: do a standard iteration + rank reduction. The original iteration should converge fast!
  • Method 2: formulate the original problem as an optimization problem

Representing operators

Before going to methods, I will describe how we approximate Hamiltonians.

We store matrices in the TT-format as well!

$$A(i_1, \ldots, i_d; j_1, \ldots j_d) = A_1(i_1, j_1) \ldots A_d(i_d, j_d).$$

For example, the Laplacian has rank $2$ in this representation (compared to rank $d$ in the CP-format).
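That rank-2 claim can be checked numerically. The sketch below (plain numpy, with helper names of my own) builds the cores $[L\ \ I]$, $\begin{pmatrix} I & 0 \\ L & I\end{pmatrix}$, $\begin{pmatrix} I \\ L\end{pmatrix}$ and contracts them back into $\sum_k I \otimes \cdots \otimes L \otimes \cdots \otimes I$:

```python
import numpy as np

def tt_matrix_full(cores):
    """Contract TT-matrix cores (shape r0 x r1 x n x n) into a full operator."""
    res = cores[0]
    for g in cores[1:]:
        r0, _, m, _ = res.shape
        _, r2, n, _ = g.shape
        # Kronecker product of the blocks, summed over the shared rank index
        res = np.einsum('abij,bckl->acikjl', res, g).reshape(r0, r2, m * n, m * n)
    return res[0, 0]

def laplace_cores(lap1d, d):
    """Rank-2 TT-matrix cores of sum_k I x ... x lap1d x ... x I (d >= 2)."""
    n = lap1d.shape[0]
    eye = np.eye(n)
    first = np.stack([lap1d, eye])[None]                 # 1 x 2 block row [L, I]
    mid = np.zeros((2, 2, n, n))
    mid[0, 0], mid[1, 0], mid[1, 1] = eye, lap1d, eye    # [[I, 0], [L, I]]
    last = np.stack([eye, lap1d])[:, None]               # 2 x 1 block column [I; L]
    return [first] + [mid] * (d - 2) + [last]
```

Multiplying the block matrices symbolically, $[L\ \ I]\begin{pmatrix} I & 0 \\ L & I\end{pmatrix} = [L \otimes I + I \otimes L,\ I \otimes I]$, which is exactly how the sum accumulates with rank staying at 2.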

Minimal eigenvalues

$$H \Psi = \Psi \Lambda, \quad F(X) = \mathrm{Tr}(X^* H X), \quad \mbox{s.t. } X^* X = I_r.$$

We represent (Khoromskij, Oseledets, Dolgov, Savostyanov Comp. Phys. Comm., 2014) $X$ in the block-TT format.

One vector is easy: just optimize in an alternating least squares fashion.

Many vectors: add $\beta$ as an additional index, $$X(i_1, \ldots, i_d, \beta) \approx X_1(i_1) \ldots X_d(i_d, \beta).$$

Here comes the magic: orthogonalization comes for free!

Index-in-a-train

The bad news: if you want to optimize over $X_k$ (with $\beta$ fixed in the last core), it does not work that well.

Solution: Reparametrize each time by setting

$$X(i_1, \ldots, i_d, \beta) \approx X_1(i_1) \ldots X_k(i_k, \beta) \ldots X_d(i_d).$$

Moving from the representation for $k+1$ to $k$ is easy (although can be done only approximately).

Local problems are then linear local eigenvalue problems!

Numerical experiments

We consider the Henon-Heiles potential and compute $B$ eigenvalues for different numbers of degrees of freedom $f$.

\begin{equation} H = -\frac12 \Delta + \underbrace{\frac12 \sum_{k=1}^f q^2_k + \overbrace{\lambda \sum_{k=1}^{f-1}\left(q^2_k q_{k+1} - \frac13 q^3_{k+1} \right)}^{\textrm{anharmonic part}}}_{\textrm{Henon-Heiles potential}~V(q_1,\ldots,q_f)}. \end{equation}

Accuracy and timings for eigb method

Summary for eigenvalue part

We have many more ideas for computing vibrational eigenstates.

But finding the constants and evaluating the results is tough, and we need chemists who are interested in using our methods.

Short demo


In [20]:
import numpy as np
from scipy.linalg import toeplitz
from tt.eigb import *
import tt
import time
from math import pi,sqrt
import quadgauss
f = 30 # Number of degrees of freedom
L = 7 # The domain is [-L, L], periodic
lm = 0.111803 # Coupling constant lambda = 1/sqrt(80)
N = 30 # Size of the spectral discretization
eps = 1e-8 # Rounding accuracy for the TT arithmetic below
x, ws = quadgauss.cdgqf(N, 6, 0, 0.5) # Generation of Hermite quadrature nodes/weights
#Generate Laplacian
lp = np.zeros((N,N))
for i in xrange(N):
    for j in xrange(N):
        if i != j:
            lp[i,j] = (-1)**(i - j)*(2*(x[i] - x[j])**(-2) - 0.5)
        else:
            lp[i,j] = 1.0/6*(4*N - 1 - 2 * x[i]**2)
lp = tt.matrix(lp)
e = tt.eye([N])
lp2 = None
for i in xrange(f):
    w = lp
    for j in xrange(i):
        w = tt.kron(e,w)
    for j in xrange(i+1,f):
        w = tt.kron(w,e)
    lp2 = lp2 + w
    lp2 = lp2.round(eps)
#Now we will compute Henon-Heiles stuff
xx = []
t = tt.tensor(x)
ee = tt.ones([N])
for  i in xrange(f):
    t0 = t
    for j in xrange(i):
        t0 = tt.kron(ee,t0)
    for j in xrange(i+1,f):
        t0 = tt.kron(t0,ee)
    xx.append(t0)
#Harmonic potential
harm = None
for i in xrange(f):
    harm = harm + (xx[i]*xx[i])
    harm = harm.round(eps)
#Henon-Heiles correction
V = None
for s in xrange(f-1):
    V = V + (xx[s]*xx[s]*xx[s+1] - (1.0/3)*xx[s+1]*xx[s+1]*xx[s+1])
    V = V.round(eps)
A = 0.5*lp2 + tt.diag(0.5*harm + lm*V)
#A0 = 0.5*lp2 + tt.diag(0.5*harm)
A = A.round(eps) 
n = A.n
d = A.tt.d
B = 2 #Number of eigenstates
r = [5]*(d+1)
r[0] = 1
r[d] = B
x0 = tt.rand(n,d,r)
t1 = time.time()
print 'Matrices are done'
y,lam = eigb(A, x0, 1e-6)
t2 = time.time()
print 'Eigenvalues:',lam    
print 'Elapsed time:', t2-t1


Matrices are done
Eigenvalues: [ 14.95842186  15.14273044]
Elapsed time: 8.37523984909

Dynamical problems

\begin{equation} \frac{d\psi}{dt} = i H \psi, \quad \psi(0) = \psi_0, \end{equation}

Dynamical problems can be solved as well. The idea is to use the projector-splitting scheme, which basically reduces to integration of the local problems (linear ODEs) + orthogonalization.

Details are in C. Lubich, I. Oseledets, B. Vandereycken, Time integration of tensor trains, SINUM, 2015.

Spectra comparison

We compare the TT-KSL scheme with the MCTDH package on a benchmark problem with the Henon-Heiles potential

Spectra for a 10D problem


Zoomed spectra for a 10D problem


Computational time: 54354 (MCTDH) vs 4425 (TT-KSL)

Problems I am going to work on

  • Sampling the PES in different coordinates (just extracting the coordinate ranges and the accuracy level of the electronic structure code from published papers can be a nightmare) + parallelizing this stuff

  • Other iterative methods for vibrational eigenstates (many of them!)

  • Redundant coordinates (I think I have an idea of how to do this on the discrete level)

  • Orbital-free DFT methods

  • Path integrals (we even have one paper on that; now generalizing to the Schrödinger equation)

  • Develop more robust optimization methods in TT

  • Develop more robust software, including parallel version

Publications and software

