Diffusion Monte Carlo: Example!

Introduction

We want to simulate a zero-temperature system using the Diffusion Monte Carlo technique. Diffusion Monte Carlo is one type of Projector Monte Carlo. The idea in Projector Monte Carlo methods is to start with a trial wavefunction, $\left|\Psi_T\right\rangle$, and apply a projection operator to it many times, driving the system toward its groundstate (for now we will not consider finite-temperature simulation). There are several different projectors we might apply:

  1. $\exp\{ -\beta\hat{H} \}$ - Diffusion Monte Carlo
  2. $\left[1 + \tau \hat{H}\right]^{-1}$ - Green's Function Monte Carlo
  3. $1 - \tau \hat{H}$ - Power Monte Carlo

We can simulate in continuous or discrete time; for now we will handle the discrete case, since it is simpler. Discretizing the imaginary time as $\beta = N\tau$ and expanding each short-time factor to first order in $\tau$ gives the operator we will actually apply: $$ \hat{P} = \left( 1 - \tau \hat{H} \right)^{N} $$ $$ \hat{P}\left|\Psi_T\right\rangle \propto \left|\Psi_0\right\rangle \quad (N \to \infty,\; \langle \Psi_0 | \Psi_T \rangle \neq 0) $$ We pick $\tau$ small to control the discretization error, and $N$ large so that the projection is effective. To see why repeated applications of $\hat{G} = \left( 1 - \tau \hat{H} \right)$ drive the system into its groundstate, write $\left|\Psi_T\right\rangle = \sum_n a_n \left|n\right\rangle$ in energy eigenstates: each application of $\hat{G}$ multiplies $a_n$ by $(1 - \tau E_n)$, and for small enough $\tau$ the groundstate factor has the largest magnitude, so the groundstate component dominates after many applications (a toy numerical demonstration follows the list of questions below). Questions we need to answer to be able to write a functioning code:

  1. What basis should we represent our wavefunction guesses in? (If we knew the eigenbasis, we would not be doing QMC.)
  2. How do we update our guesses?
  3. How can we control the wavefunction so that we only focus on the most important contributions?
  4. How do we measure observables (there is a nasty little catch here)?
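Before building the full machinery, here is a minimal sketch of the projection idea on a hypothetical two-level Hamiltonian (not the model studied below): repeatedly applying $\hat{G} = 1 - \tau\hat{H}$ and renormalizing drives an arbitrary trial vector toward the groundstate.

import numpy

# A hypothetical 2x2 Hamiltonian, used only to illustrate the projection
H   = numpy.array( [ [  0.0, -1.0 ],
                     [ -1.0,  2.0 ] ] )
tau = 0.1
psi = numpy.array( [ 0.3, 0.7 ] ) # any trial vector with non-zero groundstate overlap

for n in range( 200 ):
    psi = psi - tau * H.dot( psi )       # apply G = 1 - tau*H
    psi = psi / numpy.linalg.norm( psi ) # renormalize

# The projected energy should agree with the exact groundstate energy
print psi.dot( H.dot( psi ) ), min( numpy.linalg.eigvalsh( H ) )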

Choosing a Basis

We need some way to represent the wavefunction internally as we update it, and something that makes it easy to work out how the Hamiltonian acts. It is convenient to work in second quantization: for a lattice of sites, we can represent our wavefunction(s) in the number (occupation) basis. A specific configuration $\left|c\right\rangle$ is a string of creation operators acting on specific sites. As an example (labelling sites from the right, starting at zero, as in a binary number), we might write: $$ \left| 0 1 0 1 1 0 \right\rangle = c^{\dagger}_4 c^{\dagger}_2 c^{\dagger}_1 \left|0\right\rangle $$ The description on the left is very easy to encode on the computer, especially if we have a hard-core/Gutzwiller constraint (no site may hold more than one particle). In the second-quantized basis we have to be careful about moving operators past one another: if the operators are fermionic, the exchanges generate signs and we may cause a sign problem.

Now we can construct some basis states. For small systems it is feasible to construct the entire Hilbert space; for bigger systems this would take an extremely long time and an extremely large amount of memory. For this reason we usually generate possible moves on the fly, and then pick one move using a weighted random variable. We can pick our starting wavefunction however we'd like, provided it has non-zero overlap with the groundstate; if the starting state is orthogonal to the groundstate, no number of applications of the projector will ever reach the groundstate. Usually we also have some rules that states in the Hilbert space must satisfy (e.g. no double occupancy, a certain filling fraction). In this example, we create a class that stores the current configuration in the number basis, along with some other information we'll need later in the computation.


In [25]:
import random

class State:
    def __init__ (self, sites):
        self.configuration = [ 0 ] * sites # Start out assuming no particles are present
        self.weight        = 1.0           # Default "importance" is 1.0
        self.history       = [ ]
    # Generates a random configuration with (up to) rowCount bosons per row and
    # colCount bosons per column, placed greedily. The bosons are hard core
    # (no double occupancy). Note: the greedy placement can occasionally fall
    # short of the targets, as in the sample output below.
    def generateConfiguration(self, rowCount, colCount, rows, cols ):
        
        rowList   = range( rows )
        random.shuffle( rowList )
        
        rowCounts = [ 0 ] * rows
        colCounts = [ 0 ] * cols
        
        for ii in rowList:
            colList   = range( cols )
            random.shuffle( colList )
            for jj in colList:
                if( colCounts[ jj ] < colCount and rowCounts[ ii ] < rowCount ):
                    self.configuration[ ii * cols + jj ] = 1 # if the site is free, place a particle
                    colCounts[ jj ] += 1
                    rowCounts[ ii ] += 1

test = State( 36 )
test.generateConfiguration( 3, 3, 6, 6 )

print test.configuration


[1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1]

Updating Configurations

We need to figure out how to apply the projection operator to the state we have. We can write another class to represent the projection operator, which has only one type of non-identity operator - the ring exchange. Below we can see how to do the single plaquette version - you can extend this to larger sizes as an exercise:
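For reference, the move implemented below corresponds to the standard plaquette ring-exchange term for hard-core bosons (the sign and coupling conventions written here are assumptions, chosen to match the weights generated by the code): $$ \hat{H}_{\mathrm{ring}} = -K \sum_{\square} \left( b^{\dagger}_{1} b_{2} b^{\dagger}_{3} b_{4} + b_{1} b^{\dagger}_{2} b_{3} b^{\dagger}_{4} \right), $$ where $1,2,3,4$ label the corners of each square plaquette $\square$ in order, so the operator hops a diagonally occupied pair of bosons onto the opposite diagonal. With the coupling stored as `parameters[0]` $= -K$, each off-diagonal matrix element of $\hat{G} = 1 - \tau\hat{H}$ is $K\tau$, which is the weight attached to each move below.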


In [26]:
class Projector:
    def __init__(self, couplings, bondList, tau ):
        self.parameters = couplings
        self.tau        = tau
        self.bonds      = bondList
        
    def generateMoves( self, configuration ):
        updateList = []
        sites = len(configuration)
        identity = configuration, 1.0
        updateList.append( identity )
        for site in range( sites ):
            horizontalSite = self.bonds[ site ][ 0 ]
            verticalSite   = self.bonds[ site ][ 1 ] # Assuming 2D Bravais lattice
            diagonalSite   = self.bonds[ horizontalSite ][ 1 ]
            
            # The ring exchange model requires configurations like
            # 0 0 0 0 
            # 0 0 1 0 
            # 0 1 0 0
            # 0 0 0 0
            # to make a move - only 2 of four sites may be occupied
            siteOccupied = configuration[ site ]
            diagonalOk   = False
            horizontalOk = False
            verticalOk   = False
            if siteOccupied :
                diagonalOk   = configuration[ diagonalSite ]
                horizontalOk = not configuration[ horizontalSite ]
                verticalOk   = not configuration[ verticalSite ]
            else :
                diagonalOk   = not configuration[ diagonalSite ]
                horizontalOk = configuration[ horizontalSite ]
                verticalOk   = configuration[ verticalSite ]
            
            movePossible = diagonalOk and horizontalOk and verticalOk
        
            if( movePossible ):
                newConfiguration = list( configuration ) # copy the list; aliasing it would corrupt the input configuration and the identity entry above
                if( siteOccupied ):
                    newConfiguration[ site ]           = 0
                    newConfiguration[ diagonalSite ]   = 0
                    newConfiguration[ horizontalSite ] = 1
                    newConfiguration[ verticalSite ]   = 1
                else:
                    newConfiguration[ site ]           = 1
                    newConfiguration[ diagonalSite ]   = 1
                    newConfiguration[ horizontalSite ] = 0
                    newConfiguration[ verticalSite ]   = 0
                update = newConfiguration, -self.parameters[ 0 ]*self.tau
                updateList.append( update )
        return updateList

Performing an Update

We need to glue our two classes together, using Projector to update the configuration stored in State. We generate the list of possible states to move to (the function above already does this), pick one of them with probability proportional to its weight, and multiply the walker's weight by the total weight of the list. We also keep track of where the walker has been in configuration space, so that we retain information about its path (this will be important later, when we make measurements).


In [27]:
def updateState( current, G ):

    updateList = G.generateMoves( current.configuration )
    # All movement probabilities are guaranteed to be positive
    # because we're working with bosons. If we were working 
    # with fermions, we would take the absolute value of the 
    # element of the projector to determine whether or not to move
    totalWeight = 0.0
    for update in updateList:
        totalWeight += update[ 1 ]
    
    cumulantList = []
    for i,update in enumerate(updateList):
        if( i == 0 ):
            cumulantList.append( update )
        else:
            cumulant = update[ 0 ], update[ 1 ] + cumulantList[ i - 1 ][ 1 ]
            cumulantList.append( cumulant )
    random.seed()
    needle   = random.uniform( 0.0,  totalWeight )
    newState = State( len( current.configuration ) ) # don't hard-code the lattice size
    if( cumulantList[ 0 ][ 1 ] >= needle ):
        newState.configuration = cumulantList[ 0 ][ 0 ]
        newState.weight        = current.weight * totalWeight
        newState.history       = current.history
    for i in range( len( cumulantList ) - 1 ):
        if ( cumulantList[ i ][ 1 ] < needle and cumulantList[ i + 1 ][ 1 ] >= needle ):
            newState.configuration = cumulantList[ i + 1 ][ 0 ]
            newState.weight        = current.weight * totalWeight
            newState.history       = current.history
    newState.history.insert( 0, current.configuration )
    return newState

Handling Populations

We could continue to just update our states over and over, and eventually make some measurements. But this is not the best idea, because we want to focus on the biggest contributions to the wavefunction. Some contributions are much more important than others, and our walkers will accumulate very different amounts of weight. How can we resample the wavefunction to focus on the most important states? We can use a similar procedure to the one for updating the state in the first place. Doing such a resampling (with fixed total population) introduces some statistical bias, which becomes less of a problem the more walkers we use.


In [28]:
# First, create the projector
bondList = [ 
    [1,6], [2,7], [3,8], [4,9], [5,10], [0,11] ,
    [7,12], [8,13], [9,14], [10,15], [11,16], [6,17] ,
    [13,18], [14,19], [15,20], [16,21], [17,22], [12,23],
    [19,24], [20,25], [21,26], [22,27], [23,28], [18,29],
    [25,30], [26,31], [27,32], [28,33], [29,34], [24,35],
    [31, 0], [32,1], [33,2], [34,3], [35,4], [30,5]
]

G = Projector( [ -1.0 ], bondList, 0.1 )
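As a quick sanity check (just a sketch, reusing the `test` state from earlier), we can ask the projector how many moves are available from that configuration. Entry 0 of the returned list is always the identity, with weight 1.0; every other entry carries weight $K\tau = 0.1$ here.

moves = G.generateMoves( test.configuration )
print len( moves ) - 1, "ring-exchange moves are possible from the test configuration"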

In [29]:
# Try creating then updating some states!

stateList = []
for i in range( 100 ):
    state = State( 36 )
    state.generateConfiguration( 3, 3, 6, 6 )
    stateList.append( state )

for update in range( 20 ):
    for i,state in enumerate(stateList):
        current = updateState( state, G )
        stateList[ i ] = current
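After these updates the walkers have accumulated quite different weights; a quick check (just a sketch) makes the spread visible:

weights = [ state.weight for state in stateList ]
print min( weights ), max( weights )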

Now we have a set of states in which some are much more important than others. Let's resample the wavefunction to focus on them:


In [30]:
def rebalanceStates( stateList ):
    # Form a list of cumulative weights (without modifying the states themselves)
    cumulantList  = []
    runningWeight = 0.0
    for state in stateList:
        runningWeight += state.weight
        cumulantList.append( runningWeight )

    accumulatedWeight = cumulantList[ -1 ]

    # Resample with a fixed population: each new walker is a copy of an old
    # walker chosen with probability proportional to its weight
    newConfigurations = []
    newHistories      = []
    for i in range( len( stateList ) ):
        needle = random.uniform( 0.0, accumulatedWeight )
        for j in range( len( cumulantList ) ):
            if( needle <= cumulantList[ j ] ):
                newConfigurations.append( list( stateList[ j ].configuration ) )
                newHistories.append( list( stateList[ j ].history ) )
                break

    # Overwrite the old population; every surviving walker starts with weight 1.0
    for i, state in enumerate( stateList ):
        state.configuration = newConfigurations[ i ]
        state.history       = newHistories[ i ]
        state.weight        = 1.0
    return accumulatedWeight

accumulated = rebalanceStates( stateList )
print accumulated


52817963.2962

We can run the simulation for many, many updates and rebalances. Doing this should drive us into a representation of the groundstate in our basis. It is a good idea to take many measurements of the groundstate. We have to keep track of how much total weight we've accumulated between each rebalance so that we know how to take a correct average of our samples, and so that we can correct for the error we introduce by throwing some information away.
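As a sketch of that bookkeeping (the list `estimates` is hypothetical; a list of per-rebalance weights like `accum` is built in a later cell): if `estimates[k]` is an observable measured just before the k-th rebalance and `accum[k]` is the total weight returned by that rebalance, the prescription above amounts to a weighted average:

def weightedAverage( estimates, accum ):
    # Combine per-rebalance estimates, weighting each by its accumulated weight
    total         = 0.0
    normalization = 0.0
    for value, weight in zip( estimates, accum ):
        total         += value * weight
        normalization += weight
    return total / normalization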

Making a Measurement

Now we can write down some local observables. An example is the charge-charge correlator $\left\langle n_i n_j \right\rangle$. The naive way to measure it is to average over the current population, weighting each walker by its weight:


In [31]:
def computeChargeCorrelator( siteI, siteJ, stateList ):
    correlation = 0.0
    totalWeight = 0.0
    for i, state in enumerate( stateList ):
        correlation += state.configuration[ siteI ] * state.configuration[ siteJ ] * state.weight
        totalWeight += state.weight
    return correlation / totalWeight

import numpy
import matplotlib.pyplot as plt
import pylab
from mpl_toolkits.mplot3d import axes3d, Axes3D
from matplotlib import cm
from matplotlib.ticker import MultipleLocator, FixedFormatter

rows = 6
cols = 6
sites = rows * cols

chargeStructureFactor = numpy.zeros( (rows + 1, cols + 1), float )
qx = numpy.arange(-numpy.pi,numpy.pi + numpy.pi/rows, 2*numpy.pi/rows )
qy = numpy.arange(-numpy.pi,numpy.pi + numpy.pi/cols, 2*numpy.pi/cols )
Qx, Qy = numpy.meshgrid(qx,qy)

for kx, qxx in enumerate(qx):
    for ky, qyy in enumerate(qy):
        for i in range(sites):
            for j in range(sites):
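                # connected correlator: subtract <n_i><n_j> = 0.25 at the intended half filling;
                # (i % cols, i / cols) give the column and row of site i (integer division in Python 2)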
                ninj = computeChargeCorrelator( i, j, stateList ) - 0.25
                chargeStructureFactor[kx,ky] += ninj*numpy.cos( qxx * (i % cols - j % cols) + qyy * (i / cols - j/cols))/sites

#Plotting the structure factor in 3D can give us some information, but also obscure things
fig = pylab.figure(figsize=(15,10))
                
majorlocator = MultipleLocator( numpy.pi/2.0 )
majorformatter = FixedFormatter( [r'$-\pi$', r'$-\frac{\pi}{2}$', r'$0$', r'$\frac{\pi}{2}$',r'$\pi$'] )

axes = fig.add_subplot(111, projection='3d')
axes.plot_surface( Qx, Qy, chargeStructureFactor, cmap=cm.YlGnBu, rstride=1, cstride=1, linewidth=1 )

axes.xaxis.set_major_locator( majorlocator )
axes.xaxis.set_major_formatter( majorformatter )
axes.yaxis.set_major_locator( majorlocator )
axes.yaxis.set_major_formatter( majorformatter )

axes.set_title("Naive Calculation of Charge Structure Factor")
axes.set_xlabel(r'$q_x$')
axes.set_ylabel(r'$q_y$')
axes.set_zlabel(r'$S(\mathbf{q})$')


Out[31]:
<matplotlib.text.Text at 0x10d455950>

The 3D plot looks nice, but it can be hard to see important features. Naively, it might look like we've done the right thing here. However, this result is actually (quantitatively) very wrong! This is due to mixed estimator error. We know how to measure observables in quantum mechanics: \begin{align*} \left\langle \hat{O}\right\rangle =& \frac{\left\langle \Psi_0 \left| \hat{O} \right| \Psi_0 \right\rangle}{\left\langle \Psi_0 | \Psi_0 \right\rangle} \\ =& \frac{\displaystyle\sum_{|c\rangle}\displaystyle\sum_{|c'\rangle}\left\langle \Psi_0 | c \right\rangle\left\langle c \left| \hat{O} \right| c' \right\rangle\left\langle c' | \Psi_0 \right\rangle}{\displaystyle\sum_{|c\rangle}\left|\left\langle \Psi_0 | c \right\rangle\right|^2} \\ =& \frac{\displaystyle\sum_{|c\rangle}\displaystyle\sum_{|c'\rangle}\left\langle \Psi_T \left| e^{-\beta\hat{H}} \right| c \right\rangle\left\langle c \left| \hat{O} \right| c' \right\rangle\left\langle c' \left| e^{-\beta\hat{H}} \right| \Psi_T \right\rangle}{\displaystyle\sum_{|c\rangle}\left|\left\langle \Psi_T \left| e^{-\beta\hat{H}} \right| c \right\rangle\right|^2} \\ \end{align*}

What did we actually do when we made the measurement in the code above? \begin{align*} \left\langle \hat{O}\right\rangle_{T} =& \frac{ \displaystyle\sum_{\left|c\right\rangle} w_c O(c) }{\displaystyle\sum_{\left|c\right\rangle} w_c } \\ =& \frac{ \displaystyle\sum_{\left|c\right\rangle} \displaystyle\sum_{\left|c'\right\rangle} \left\langle \Psi_0 | c \right\rangle \left\langle c \left| \hat{O} \right| c' \right\rangle \left\langle c' | \Psi_T\right\rangle }{\displaystyle\sum_{\left|c\right\rangle} \left\langle \Psi_0 | c \right\rangle } \\ =& \frac{ \displaystyle\sum_{\left|c\right\rangle} \displaystyle\sum_{\left|c'\right\rangle} \left\langle \Psi_T \left| e^{-\beta\hat{H}} \right| c \right\rangle \left\langle c \left| \hat{O} \right| c' \right\rangle \left\langle c' | \Psi_T\right\rangle }{\displaystyle\sum_{\left|c\right\rangle} \left\langle \Psi_T \left| e^{-\beta\hat{H}} \right| c \right\rangle } \\ \end{align*}

This looks bad. What we measured above is not a good quantum mechanical average! This is the mixed estimator error. If $\hat{O}$ commutes with $\hat{H}$, then we might be ok, because we could use: $$ e^{-\beta\hat{H}}\hat{O} = e^{-\beta\hat{H}/2}\hat{O}e^{-\beta\hat{H}/2} $$ which would fix everything. Thus, it's fine to measure the energy using the naive average. But usually $\hat{O}$ does not commute with the Hamiltonian. What can we do? This is why we store the history of each walker when doing the update - we can use this history to correct the mixed estimator error. If we had something like: \begin{align*} \left\langle \hat{O} \right\rangle_T =& \frac{\displaystyle\sum_{|c\rangle}\displaystyle\sum_{|c'\rangle}\left\langle \Psi_T \left| e^{-\beta\hat{H} + \alpha\hat{H}} \right| c \right\rangle\left\langle c \left| \hat{O} \right| c' \right\rangle\left\langle c' \left| e^{-\alpha\hat{H}} \right| \Psi_T \right\rangle}{\displaystyle\sum_{|c\rangle}\left|\left\langle \Psi_T \left| e^{-\beta\hat{H}} \right| c \right\rangle\right|^2} \end{align*} then things would be ok if $\alpha$ is large enough.

What this means in practice is that we can do the average with the weights from the end of the simulation, and the configurations from the history of the walker. This is called forward walking. Note: if you take multiple measurements, remember that they are not all weighted equally! You will need to use the accumulated weight at each rebalance to weight the average!
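Restating that prescription as a formula (a sketch, using the notation above, with $t_{\mathrm{fw}}$ denoting how far back we walk): if walker $i$ carries weight $W_i$ at the end of the run and occupied configuration $c_i(t - t_{\mathrm{fw}})$ a number $t_{\mathrm{fw}}$ of updates earlier, the forward-walking estimate of a diagonal observable is $$ \left\langle \hat{O} \right\rangle_{\mathrm{FW}} \approx \frac{\displaystyle\sum_{i} W_i \, O\!\left(c_i(t - t_{\mathrm{fw}})\right)}{\displaystyle\sum_{i} W_i}, $$ which is what the next cell computes with $t_{\mathrm{fw}} = 30$.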


In [32]:
accum = [ 0 ] * 100
for sweeps in range( 100 ):
    for update in range( 20 ):
        for i,state in enumerate(stateList):
            current = updateState( state, G )
            stateList[ i ] = current
    accum[ sweeps ] = rebalanceStates( stateList )

totalWeight = numpy.sum( accum )
averageWeight = numpy.average( accum )

#Let's forward walk 30 configurations back!
#Remember, after the rebalance, stateList's weights are all 1.0
def computeChargeCorrelatorFW( siteI, siteJ, stateList, step ):
    correlation = 0.0
    totalWeight = 0.0
    for i, state in enumerate( stateList ):
        correlation += state.history[ step ][ siteI ] * state.history[ step ][ siteJ ] * state.weight
        totalWeight += state.weight
    return correlation / totalWeight

# Reset the structure factor so the forward-walking result doesn't pile on top of the naive one
chargeStructureFactor = numpy.zeros( (rows + 1, cols + 1), float )
for kx, qxx in enumerate(qx):
    for ky, qyy in enumerate(qy):
        for i in range(sites):
            for j in range(sites):
                ninj = computeChargeCorrelatorFW( i, j, stateList, 30 ) - 0.25
                chargeStructureFactor[kx,ky] += ninj*numpy.cos( qxx * (i % cols - j % cols) + qyy * (i / cols - j/cols))/sites

fig = pylab.figure(figsize=(15,10))
                
majorlocator = MultipleLocator( numpy.pi/2.0 )
majorformatter = FixedFormatter( [r'$-\pi$', r'$-\frac{\pi}{2}$', r'$0$', r'$\frac{\pi}{2}$',r'$\pi$'] )

axes = fig.add_subplot(111, projection='3d')
axes.plot_surface( Qx, Qy, chargeStructureFactor, cmap=cm.YlGnBu, rstride=1, cstride=1, linewidth=1 )

axes.xaxis.set_major_locator( majorlocator )
axes.xaxis.set_major_formatter( majorformatter )
axes.yaxis.set_major_locator( majorlocator )
axes.yaxis.set_major_formatter( majorformatter )

axes.set_title("Forward Walking Calculation of Charge Structure Factor")
axes.set_xlabel(r'$q_x$')
axes.set_ylabel(r'$q_y$')
axes.set_zlabel(r'$S(\mathbf{q})$')


Out[32]:
<matplotlib.text.Text at 0x10d42ee10>

Let's make a contour plot too!


In [33]:
fig = pylab.figure(figsize=(15,10))
                
majorlocator = MultipleLocator( numpy.pi/2.0 )
majorformatter = FixedFormatter( [r'$-\pi$', r'$-\frac{\pi}{2}$', r'$0$', r'$\frac{\pi}{2}$',r'$\pi$'] )

axes = fig.add_subplot(1,1,1)
axes.contourf( qx, qy, chargeStructureFactor, 15, cmap=cm.YlGnBu )

axes.xaxis.set_major_locator( majorlocator )
axes.xaxis.set_major_formatter( majorformatter )
axes.yaxis.set_major_locator( majorlocator )
axes.yaxis.set_major_formatter( majorformatter )

axes.set_title("Forward Walking Calculation of Charge Structure Factor")
axes.set_xlabel(r'$q_x$')
axes.set_ylabel(r'$q_y$')


Out[33]:
<matplotlib.text.Text at 0x10d425e90>