Morten Hjorth-Jensen, Department of Physics, University of Oslo and Department of Physics and Astronomy and National Superconducting Cyclotron Laboratory, Michigan State University
Date: Aug 23, 2017
Copyright 1999-2017, Morten Hjorth-Jensen. Released under CC Attribution-NonCommercial 4.0 license
Given a hamiltonian $H$ and a trial wave function $\Psi_T$, the variational principle states that the expectation value of $\langle H \rangle$, defined through
is an upper bound to the ground state energy $E_0$ of the hamiltonian $H$, that is
In general, the integrals involved in the calculation of various expectation values are multi-dimensional ones. Traditional integration methods such as the Gauss-Legendre will not be adequate for say the computation of the energy of a many-body system.
The trial wave function can be expanded in the eigenstates of the hamiltonian since they form a complete set, viz.,
and assuming the set of eigenfunctions to be normalized one obtains
where we used that $H(\boldsymbol{R})\Psi_n(\boldsymbol{R})=E_n\Psi_n(\boldsymbol{R})$. In general, the integrals involved in the calculation of various expectation values are multi-dimensional ones. The variational principle yields the lowest state of a given symmetry.
In most cases, a wave function has only small values in large parts of configuration space, and a straightforward procedure which uses homogenously distributed random points in configuration space will most likely lead to poor results. This may suggest that some kind of importance sampling combined with e.g., the Metropolis algorithm may be a more efficient way of obtaining the ground state energy. The hope is then that those regions of configurations space where the wave function assumes appreciable values are sampled more efficiently.
The tedious part in a VMC calculation is the search for the variational minimum. A good knowledge of the system is required in order to carry out reasonable VMC calculations. This is not always the case, and often VMC calculations serve rather as the starting point for so-called diffusion Monte Carlo calculations (DMC). DMC is a way of solving exactly the many-body Schroedinger equation by means of a stochastic procedure. A good guess on the binding energy and its wave function is however necessary. A carefully performed VMC calculation can aid in this context.
$\boldsymbol{R}=(\boldsymbol{R}_1,\dots ,\boldsymbol{R}_N)$. The trial wave function depends on $\alpha$ variational parameters $\boldsymbol{\alpha}=(\alpha_1,\dots ,\alpha_M)$.
This is our new probability distribution function (PDF). The approximation to the expectation value of the Hamiltonian is now
called the local energy, which, together with our trial PDF yields
with $N$ being the number of Monte Carlo samples.
The Algorithm for performing a variational Monte Carlo calculations runs thus as this
Initialisation: Fix the number of Monte Carlo steps. Choose an initial $\boldsymbol{R}$ and variational parameters $\alpha$ and calculate $\left|\psi_T^{\alpha}(\boldsymbol{R})\right|^2$.
Initialise the energy and the variance and start the Monte Carlo calculation.
Calculate a trial position $\boldsymbol{R}_p=\boldsymbol{R}+r*step$ where $r$ is a random variable $r \in [0,1]$.
Metropolis algorithm to accept or reject this move $w = P(\boldsymbol{R}_p)/P(\boldsymbol{R})$.
If the step is accepted, then we set $\boldsymbol{R}=\boldsymbol{R}_p$.
Update averages
Observe that the jumping in space is governed by the variable step. This is Called brute-force sampling. Need importance sampling to get more relevant sampling, see lectures below.
The radial Schroedinger equation for the hydrogen atom can be written as
or with dimensionless variables
with the hamiltonian
Use variational parameter $\alpha$ in the trial wave function
A simple variational Monte Carlo calculation results in
$\alpha$ | $\langle H \rangle $ | $\sigma^2$ | $\sigma/\sqrt{N}$ |
---|---|---|---|
7.00000E-01 | -4.57759E-01 | 4.51201E-02 | 6.71715E-04 |
8.00000E-01 | -4.81461E-01 | 3.05736E-02 | 5.52934E-04 |
9.00000E-01 | -4.95899E-01 | 8.20497E-03 | 2.86443E-04 |
1.00000E-00 | -5.00000E-01 | 0.00000E+00 | 0.00000E+00 |
1.10000E+00 | -4.93738E-01 | 1.16989E-02 | 3.42036E-04 |
1.20000E+00 | -4.75563E-01 | 8.85899E-02 | 9.41222E-04 |
1.30000E+00 | -4.54341E-01 | 1.45171E-01 | 1.20487E-03 |
We note that at $\alpha=1$ we obtain the exact result, and the variance is zero, as it should. The reason is that we then have the exact wave function, and the action of the hamiltionan on the wave function
yields just a constant. The integral which defines various expectation values involving moments of the hamiltonian becomes then
This gives an important information: the exact wave function leads to zero variance! Variation is then performed by minimizing both the energy and the variance.
The helium atom consists of two electrons and a nucleus with
charge $Z=2$.
The contribution
to the potential energy due to the attraction from the nucleus is
and if we add the repulsion arising from the two interacting electrons, we obtain the potential energy
and Schroedingers equation reads
All observables are evaluated with respect to the probability distribution
2 1
< < < ! ! M A T H _ B L O C K
For small values of $r_1$, the terms which dominate are
and
A similar condition applies to electron 2 as well. For orbital momenta $l > 0$ we have
Similarly, studying the case $r_{12}\rightarrow 0$ we can write a possible trial wave function as
The last equation can be generalized to
for a system with $N$ electrons or particles.
During the development of our code we need to make several checks. It is also very instructive to compute a closed form expression for the local energy. Since our wave function is rather simple it is straightforward to find an analytic expressions. Consider first the case of the simple helium function
The local energy is for this case
which gives an expectation value for the local energy given by
which means that the gradient needed for the so-called quantum force and local energy can be calculated analytically. This will speed up your code since the computation of the correlation part and the Slater determinant are the most time consuming parts in your code.
We will refer to this correlation function as $\Psi_C$ or the linear Pade-Jastrow.
We can test this by computing the local energy for our helium wave function
with $\alpha$ and $\beta$ as variational parameters.
The local energy is for this case
It is very useful to test your code against these expressions. It means also that you don't need to compute a derivative numerically as discussed in the code example below.
For the computation of various derivatives with different types of wave functions, you will find it useful to use python with symbolic python, that is sympy, see online manual. Using sympy allows you autogenerate both Latex code as well c++, python or Fortran codes. Here you will find some simple examples. We choose the $2s$ hydrogen-orbital (not normalized) as an example
with $ r^2 = x^2 + y^2 + z^2$.
In [1]:
from sympy import symbols, diff, exp, sqrt
x, y, z, Z = symbols('x y z Z')
r = sqrt(x*x + y*y + z*z)
r
phi = (Z*r - 2)*exp(-Z*r/2)
phi
diff(phi, x)
In [2]:
from sympy import symbols, diff, exp, sqrt, factor, Symbol, printing
x, y, z, Z = symbols('x y z Z')
r = sqrt(x*x + y*y + z*z)
phi = (Z*r - 2)*exp(-Z*r/2)
R = Symbol('r') #Creates a symbolic equivalent of r
#print latex and c++ code
print printing.latex(diff(phi, x).factor().subs(r, R))
print printing.ccode(diff(phi, x).factor().subs(r, R))
In [3]:
from sympy import symbols, diff, exp, sqrt, factor, Symbol, printing
x, y, z, Z = symbols('x y z Z')
r = sqrt(x*x + y*y + z*z)
phi = (Z*r - 2)*exp(-Z*r/2)
R = Symbol('r') #Creates a symbolic equivalent of r
(diff(diff(phi, x), x) + diff(diff(phi, y), y) + diff(diff(phi, z), z)).factor().subs(r, R)
# Collect the Z values
(diff(diff(phi, x), x) + diff(diff(phi, y), y) +diff(diff(phi, z), z)).factor().collect(Z).subs(r, R)
# Factorize also the r**2 terms
(diff(diff(phi, x), x) + diff(diff(phi, y), y) + diff(diff(phi, z), z)).factor().collect(Z).subs(r, R).subs(r**2, R**2).factor()
print printing.ccode((diff(diff(phi, x), x) + diff(diff(phi, y), y) + diff(diff(phi, z), z)).factor().collect(Z).subs(r, R).subs(r**2, R**2).factor())
With some practice this allows one to be able to check one's own calculation and translate automatically into code lines.
The c++ code with a VMC Solver class, main program first.
#include "vmcsolver.h"
#include <iostream>
using namespace std;
int main()
{
VMCSolver *solver = new VMCSolver();
solver->runMonteCarloIntegration();
return 0;
}
#ifndef VMCSOLVER_H
#define VMCSOLVER_H
#include <armadillo>
using namespace arma;
class VMCSolver
{
public:
VMCSolver();
void runMonteCarloIntegration();
private:
double waveFunction(const mat &r);
double localEnergy(const mat &r);
int nDimensions;
int charge;
double stepLength;
int nParticles;
double h;
double h2;
long idum;
double alpha;
int nCycles;
mat rOld;
mat rNew;
};
#endif // VMCSOLVER_H
#include "vmcsolver.h"
#include "lib.h"
#include <armadillo>
#include <iostream>
using namespace arma;
using namespace std;
VMCSolver::VMCSolver() :
nDimensions(3),
charge(2),
stepLength(1.0),
nParticles(2),
h(0.001),
h2(1000000),
idum(-1),
alpha(0.5*charge),
nCycles(1000000)
{
}
void VMCSolver::runMonteCarloIntegration()
{
rOld = zeros<mat>(nParticles, nDimensions);
rNew = zeros<mat>(nParticles, nDimensions);
double waveFunctionOld = 0;
double waveFunctionNew = 0;
double energySum = 0;
double energySquaredSum = 0;
double deltaE;
// initial trial positions
for(int i = 0; i < nParticles; i++) {
for(int j = 0; j < nDimensions; j++) {
rOld(i,j) = stepLength * (ran2(&idum) - 0.5);
}
}
rNew = rOld;
// loop over Monte Carlo cycles
for(int cycle = 0; cycle < nCycles; cycle++) {
// Store the current value of the wave function
waveFunctionOld = waveFunction(rOld);
// New position to test
for(int i = 0; i < nParticles; i++) {
for(int j = 0; j < nDimensions; j++) {
rNew(i,j) = rOld(i,j) + stepLength*(ran2(&idum) - 0.5);
}
// Recalculate the value of the wave function
waveFunctionNew = waveFunction(rNew);
// Check for step acceptance (if yes, update position, if no, reset position)
if(ran2(&idum) <= (waveFunctionNew*waveFunctionNew) / (waveFunctionOld*waveFunctionOld)) {
for(int j = 0; j < nDimensions; j++) {
rOld(i,j) = rNew(i,j);
waveFunctionOld = waveFunctionNew;
}
} else {
for(int j = 0; j < nDimensions; j++) {
rNew(i,j) = rOld(i,j);
}
}
// update energies
deltaE = localEnergy(rNew);
energySum += deltaE;
energySquaredSum += deltaE*deltaE;
}
}
double energy = energySum/(nCycles * nParticles);
double energySquared = energySquaredSum/(nCycles * nParticles);
cout << "Energy: " << energy << " Energy (squared sum): " << energySquared << endl;
}
double VMCSolver::localEnergy(const mat &r)
{
mat rPlus = zeros<mat>(nParticles, nDimensions);
mat rMinus = zeros<mat>(nParticles, nDimensions);
rPlus = rMinus = r;
double waveFunctionMinus = 0;
double waveFunctionPlus = 0;
double waveFunctionCurrent = waveFunction(r);
// Kinetic energy, brute force derivations
double kineticEnergy = 0;
for(int i = 0; i < nParticles; i++) {
for(int j = 0; j < nDimensions; j++) {
rPlus(i,j) += h;
rMinus(i,j) -= h;
waveFunctionMinus = waveFunction(rMinus);
waveFunctionPlus = waveFunction(rPlus);
kineticEnergy -= (waveFunctionMinus + waveFunctionPlus - 2 * waveFunctionCurrent);
rPlus(i,j) = r(i,j);
rMinus(i,j) = r(i,j);
}
}
kineticEnergy = 0.5 * h2 * kineticEnergy / waveFunctionCurrent;
// Potential energy
double potentialEnergy = 0;
double rSingleParticle = 0;
for(int i = 0; i < nParticles; i++) {
rSingleParticle = 0;
for(int j = 0; j < nDimensions; j++) {
rSingleParticle += r(i,j)*r(i,j);
}
potentialEnergy -= charge / sqrt(rSingleParticle);
}
// Contribution from electron-electron potential
double r12 = 0;
for(int i = 0; i < nParticles; i++) {
for(int j = i + 1; j < nParticles; j++) {
r12 = 0;
for(int k = 0; k < nDimensions; k++) {
r12 += (r(i,k) - r(j,k)) * (r(i,k) - r(j,k));
}
potentialEnergy += 1 / sqrt(r12);
}
}
return kineticEnergy + potentialEnergy;
}
double VMCSolver::waveFunction(const mat &r)
{
double argument = 0;
for(int i = 0; i < nParticles; i++) {
double rSingleParticle = 0;
for(int j = 0; j < nDimensions; j++) {
rSingleParticle += r(i,j) * r(i,j);
}
argument += sqrt(rSingleParticle);
}
return exp(-argument * alpha);
}
#include <armadillo>
#include <iostream>
using namespace arma;
using namespace std;
double ran2(long *);
class VMCSolver
{
public:
VMCSolver();
void runMonteCarloIntegration();
private:
double waveFunction(const mat &r);
double localEnergy(const mat &r);
int nDimensions;
int charge;
double stepLength;
int nParticles;
double h;
double h2;
long idum;
double alpha;
int nCycles;
mat rOld;
mat rNew;
};
VMCSolver::VMCSolver() :
nDimensions(3),
charge(2),
stepLength(1.0),
nParticles(2),
h(0.001),
h2(1000000),
idum(-1),
alpha(0.5*charge),
nCycles(1000000)
{
}
void VMCSolver::runMonteCarloIntegration()
{
rOld = zeros<mat>(nParticles, nDimensions);
rNew = zeros<mat>(nParticles, nDimensions);
double waveFunctionOld = 0;
double waveFunctionNew = 0;
double energySum = 0;
double energySquaredSum = 0;
double deltaE;
// initial trial positions
for(int i = 0; i < nParticles; i++) {
for(int j = 0; j < nDimensions; j++) {
rOld(i,j) = stepLength * (ran2(&idum) - 0.5);
}
}
rNew = rOld;
// loop over Monte Carlo cycles
for(int cycle = 0; cycle < nCycles; cycle++) {
// Store the current value of the wave function
waveFunctionOld = waveFunction(rOld);
// New position to test
for(int i = 0; i < nParticles; i++) {
for(int j = 0; j < nDimensions; j++) {
rNew(i,j) = rOld(i,j) + stepLength*(ran2(&idum) - 0.5);
}
// Recalculate the value of the wave function
waveFunctionNew = waveFunction(rNew);
// Check for step acceptance (if yes, update position, if no, reset position)
if(ran2(&idum) <= (waveFunctionNew*waveFunctionNew) / (waveFunctionOld*waveFunctionOld)) {
for(int j = 0; j < nDimensions; j++) {
rOld(i,j) = rNew(i,j);
waveFunctionOld = waveFunctionNew;
}
} else {
for(int j = 0; j < nDimensions; j++) {
rNew(i,j) = rOld(i,j);
}
}
// update energies
deltaE = localEnergy(rNew);
energySum += deltaE;
energySquaredSum += deltaE*deltaE;
}
}
double energy = energySum/(nCycles * nParticles);
double energySquared = energySquaredSum/(nCycles * nParticles);
cout << "Energy: " << energy << " Energy (squared sum): " << energySquared << endl;
}
double VMCSolver::localEnergy(const mat &r)
{
mat rPlus = zeros<mat>(nParticles, nDimensions);
mat rMinus = zeros<mat>(nParticles, nDimensions);
rPlus = rMinus = r;
double waveFunctionMinus = 0;
double waveFunctionPlus = 0;
double waveFunctionCurrent = waveFunction(r);
// Kinetic energy, brute force derivations
double kineticEnergy = 0;
for(int i = 0; i < nParticles; i++) {
for(int j = 0; j < nDimensions; j++) {
rPlus(i,j) += h;
rMinus(i,j) -= h;
waveFunctionMinus = waveFunction(rMinus);
waveFunctionPlus = waveFunction(rPlus);
kineticEnergy -= (waveFunctionMinus + waveFunctionPlus - 2 * waveFunctionCurrent);
rPlus(i,j) = r(i,j);
rMinus(i,j) = r(i,j);
}
}
kineticEnergy = 0.5 * h2 * kineticEnergy / waveFunctionCurrent;
// Potential energy
double potentialEnergy = 0;
double rSingleParticle = 0;
for(int i = 0; i < nParticles; i++) {
rSingleParticle = 0;
for(int j = 0; j < nDimensions; j++) {
rSingleParticle += r(i,j)*r(i,j);
}
potentialEnergy -= charge / sqrt(rSingleParticle);
}
// Contribution from electron-electron potential
double r12 = 0;
for(int i = 0; i < nParticles; i++) {
for(int j = i + 1; j < nParticles; j++) {
r12 = 0;
for(int k = 0; k < nDimensions; k++) {
r12 += (r(i,k) - r(j,k)) * (r(i,k) - r(j,k));
}
potentialEnergy += 1 / sqrt(r12);
}
}
return kineticEnergy + potentialEnergy;
}
double VMCSolver::waveFunction(const mat &r)
{
double argument = 0;
for(int i = 0; i < nParticles; i++) {
double rSingleParticle = 0;
for(int j = 0; j < nDimensions; j++) {
rSingleParticle += r(i,j) * r(i,j);
}
argument += sqrt(rSingleParticle);
}
return exp(-argument * alpha);
}
/*
** The function
** ran2()
** is a long periode (> 2 x 10^18) random number generator of
** L'Ecuyer and Bays-Durham shuffle and added safeguards.
** Call with idum a negative integer to initialize; thereafter,
** do not alter idum between sucessive deviates in a
** sequence. RNMX should approximate the largest floating point value
** that is less than 1.
** The function returns a uniform deviate between 0.0 and 1.0
** (exclusive of end-point values).
*/
#define IM1 2147483563
#define IM2 2147483399
#define AM (1.0/IM1)
#define IMM1 (IM1-1)
#define IA1 40014
#define IA2 40692
#define IQ1 53668
#define IQ2 52774
#define IR1 12211
#define IR2 3791
#define NTAB 32
#define NDIV (1+IMM1/NTAB)
#define EPS 1.2e-7
#define RNMX (1.0-EPS)
double ran2(long *idum)
{
int j;
long k;
static long idum2 = 123456789;
static long iy=0;
static long iv[NTAB];
double temp;
if(*idum <= 0) {
if(-(*idum) < 1) *idum = 1;
else *idum = -(*idum);
idum2 = (*idum);
for(j = NTAB + 7; j >= 0; j--) {
k = (*idum)/IQ1;
*idum = IA1*(*idum - k*IQ1) - k*IR1;
if(*idum < 0) *idum += IM1;
if(j < NTAB) iv[j] = *idum;
}
iy=iv[0];
}
k = (*idum)/IQ1;
*idum = IA1*(*idum - k*IQ1) - k*IR1;
if(*idum < 0) *idum += IM1;
k = idum2/IQ2;
idum2 = IA2*(idum2 - k*IQ2) - k*IR2;
if(idum2 < 0) idum2 += IM2;
j = iy/NDIV;
iy = iv[j] - idum2;
iv[j] = *idum;
if(iy < 1) iy += IMM1;
if((temp = AM*iy) > RNMX) return RNMX;
else return temp;
}
#undef IM1
#undef IM2
#undef AM
#undef IMM1
#undef IA1
#undef IA2
#undef IQ1
#undef IQ2
#undef IR1
#undef IR2
#undef NTAB
#undef NDIV
#undef EPS
#undef RNMX
// End: function ran2()
#include <iostream>
using namespace std;
int main()
{
VMCSolver *solver = new VMCSolver();
solver->runMonteCarloIntegration();
return 0;
}
The Metropolis algorithm , see the original article (see also the FYS3150 lectures) was invented by Metropolis et. al and is often simply called the Metropolis algorithm. It is a method to sample a normalized probability distribution by a stochastic process. We define ${\cal P}_i^{(n)}$ to be the probability for finding the system in the state $i$ at step $n$. The algorithm is then
Sample a possible new state $j$ with some probability $T_{i\rightarrow j}$.
Accept the new state $j$ with probability $A_{i \rightarrow j}$ and use it as the next sample. With probability $1-A_{i\rightarrow j}$ the move is rejected and the original state $i$ is used again as a sample.
We wish to derive the required properties of $T$ and $A$ such that ${\cal P}_i^{(n\rightarrow \infty)} \rightarrow p_i$ so that starting from any distribution, the method converges to the correct distribution. Note that the description here is for a discrete probability distribution. Replacing probabilities $p_i$ with expressions like $p(x_i)dx_i$ will take all of these over to the corresponding continuum expressions.
The dynamical equation for ${\cal P}_i^{(n)}$ can be written directly from the description above. The probability of being in the state $i$ at step $n$ is given by the probability of being in any state $j$ at the previous step, and making an accepted transition to $i$ added to the probability of being in the state $i$, making a transition to any state $j$ and rejecting the move:
Since the probability of making some transition must be 1, $\sum_j T_{i\rightarrow j} = 1$, and the above equation becomes
The balance requirement is very weak. Typically the much stronger detailed balance requirement is enforced, that is rather than the sum being set to zero, we set each term separately to zero and use this to determine the acceptance probabilities. Rearranging, the result is
Other choices are possible, but they all correspond to multilplying $A_{i\rightarrow j}$ and $A_{j\rightarrow i}$ by the same constant smaller than unity.\footnote{The penalty function method uses just such a factor to compensate for $p_i$ that are evaluated stochastically and are therefore noisy.}
Having chosen the acceptance probabilities, we have guaranteed that if the ${\cal P}_i^{(n)}$ has equilibrated, that is if it is equal to $p_i$, it will remain equilibrated. Next we need to find the circumstances for convergence to equilibrium.
The dynamical equation can be written as
with the matrix $M$ given by
Summing over $i$ shows that $\sum_i M_{ij} = 1$, and since $\sum_k T_{i\rightarrow k} = 1$, and $A_{i \rightarrow k} \leq 1$, the elements of the matrix satisfy $M_{ij} \geq 0$. The matrix $M$ is therefore a stochastic matrix.
The Metropolis method is simply the power method for computing the right eigenvector of $M$ with the largest magnitude eigenvalue. By construction, the correct probability distribution is a right eigenvector with eigenvalue 1. Therefore, for the Metropolis method to converge to this result, we must show that $M$ has only one eigenvalue with this magnitude, and all other eigenvalues are smaller.