Tasks covered in this notebook:
ToyDynamics
packagePath sampling methods require that the user supply an input path for each path ensemble. This means that you must somehow generate a first input path. In general, getting the first input paths for TIS boils down to solving two problems:
Since transition paths satisfy all the path ensembles between two states, a common approach is to use the same initial path for all path ensembles (reversing it where necessary). This way, once you've solved problem 1, problem 2 is trivial.
The first rare path can come from any number of sources. The obvious approach might be to find a transition path from a committor analysis, because this would give you a path that satisfies the true dynamics. However, the initial path doesn't need to satisfy the true dynamics of the system (later we can equilibrate the path ensemble anyway). This means that other approaches, such as high-temperature trajectories, can be used for the first path. One of the most widely-used methods to get an initial trajectory is to generate a transition trajectory using metadynamics. However, the downside of using paths that don't satisfy the true dynamics is that it might be harder to get them to equilibrate to the correct path ensemble. It seems that this has been more of a problem with paths from high temperature runs than with paths from metadynamics: in the first metadynamics transition, the barrier region is often quite similar to the native dynamics.
In this example, we use a bootstrapping approach, which does create paths satisfying the true dynamics of the system. This bootstrapping is nice because it is quick and convenient, although it works best on smaller systems with less complicated transitions. It works by running normal MD to generate a path that satisfies the innermost interface, and then performing shooting moves in that interface's path ensemble until we have a path that crosses the next interface. Then we switch to the path ensemble for the next interface, and shoot until the path crossing the interface after that. The process continues until we have paths for all interfaces.
In [1]:
# Basic imports
%matplotlib inline
import openpathsampling as paths
import numpy as np
# used for visualization of the 2D toy system
# we use the %run magic because this isn't in a package
%run ../resources/toy_plot_helpers.py
First we set up our system: for the toy dynamics, this involves defining a potential energy surface (PES), setting up an integrator, and giving the simulation an initial configuration. In real MD systems, the PES is handled by the combination of a topology file and a force field definition, and the initial configuration would come from a file instead of being described by hand.
In [2]:
# convenience for the toy dynamics
import openpathsampling.engines.toy as toys
First we need to describe the system we'll be simulating. With biomolecular systems, this is often done with an initial PDB structure and a choice of force field. For the toy model, we need to give a snapshot as a template, as well as a potential energy surface. The template snapshot also includes a pointer to the topology information (which is relatively simple for the toy systems.)
In [3]:
# Toy_PES supports adding/subtracting various PESs.
# The OuterWalls PES type gives an x^6+y^6 boundary to the system.
pes = (
toys.OuterWalls([1.0, 1.0], [0.0, 0.0]) +
toys.Gaussian(-0.7, [12.0, 12.0], [0.0, 0.4]) +
toys.Gaussian(-0.7, [12.0, 12.0], [-0.5, -0.5]) +
toys.Gaussian(-0.7, [12.0, 12.0], [0.5, -0.5])
)
topology=toys.Topology(
n_spatial = 2,
masses =[1.0, 1.0],
pes = pes
)
In [4]:
integ = toys.LangevinBAOABIntegrator(dt=0.02, temperature=0.1, gamma=2.5)
options={
'integ' : integ,
'n_frames_max' : 5000,
'n_steps_per_frame' : 1
}
toy_eng = toys.Engine(
options=options,
topology=topology
)
toy_eng.initialized = True
In [5]:
template = toys.Snapshot(
coordinates=np.array([[-0.5, -0.5]]),
velocities=np.array([[0.0,0.0]]),
engine=toy_eng
)
In [6]:
toy_eng.current_snapshot = template
Finally, we make this engine into the default engine for any PathMover
that requires an one (e.g., shooting movers, minus movers).
In [7]:
paths.PathMover.engine = toy_eng
Now let's look at the potential energy surface we've created:
In [8]:
plot = ToyPlot()
plot.contour_range = np.arange(-1.5, 1.0, 0.1)
plot.add_pes(pes)
fig = plot.plot()
TIS methods usually require that you define states and interfaces before starting the simulation. State and interfaces are both defined in terms of Volume
objects. The most common type of Volume
is one based on some set of collective variables, so the first thing we have to do is to define the collective variable.
For this system, we'll define the collective variables as circles centered on the middle of the state. OPS allows us to define one function for the circle, which is parameterized by different centers. Note that each collective variable is in fact a separate function.
In [9]:
def circle(snapshot, center):
import math
return math.sqrt((snapshot.xyz[0][0]-center[0])**2 + (snapshot.xyz[0][1]-center[1])**2)
opA = paths.CoordinateFunctionCV(name="opA", f=circle, center=[-0.5, -0.5])
opB = paths.CoordinateFunctionCV(name="opB", f=circle, center=[0.5, -0.5])
opC = paths.CoordinateFunctionCV(name="opC", f=circle, center=[0.0, 0.4])
Now we define the states and interfaces in terms of these order parameters. The CVRangeVolumeSet
gives a shortcut to create several volume objects using the same collective variable.
In [10]:
stateA = paths.CVDefinedVolume(opA, 0.0, 0.2)
stateB = paths.CVDefinedVolume(opB, 0.0, 0.2)
stateC = paths.CVDefinedVolume(opC, 0.0, 0.2)
interfacesA = paths.VolumeInterfaceSet(opA, 0.0, [0.2, 0.3, 0.4])
interfacesB = paths.VolumeInterfaceSet(opB, 0.0, [0.2, 0.3, 0.4])
interfacesC = paths.VolumeInterfaceSet(opC, 0.0, [0.2, 0.3, 0.4])
In [11]:
ms_outers = paths.MSOuterTISInterface.from_lambdas(
{ifaces: 0.5
for ifaces in [interfacesA, interfacesB, interfacesC]}
)
mstis = paths.MSTISNetwork(
[(stateA, interfacesA),
(stateB, interfacesB),
(stateC, interfacesC)],
ms_outers=ms_outers
)
Now we actually run the bootstrapping calculation. The full_bootstrap
function requires an initial snapshot in the state, and then it will generate trajectories satisfying TIS ensemble for the given interfaces. To fill all the ensembles in the MSTIS network, we need to do this once for each initial state.
In [12]:
initA = toys.Snapshot(
coordinates=np.array([[-0.5, -0.5]]),
velocities=np.array([[1.0,0.0]]),
)
bootstrapA = paths.FullBootstrapping(
transition=mstis.from_state[stateA],
snapshot=initA,
engine=toy_eng,
extra_interfaces=[ms_outers.volume_for_interface_set(interfacesA)]
)
gsA = bootstrapA.run()
In [13]:
initB = toys.Snapshot(
coordinates=np.array([[0.5, -0.5]]),
velocities=np.array([[-1.0,0.0]]),
)
bootstrapB = paths.FullBootstrapping(
transition=mstis.from_state[stateB],
snapshot=initB,
engine=toy_eng,
)
gsB = bootstrapB.run()
In [14]:
initC = toys.Snapshot(
coordinates=np.array([[0.0, 0.4]]),
velocities=np.array([[0.0,-0.5]]),
)
bootstrapC = paths.FullBootstrapping(
transition=mstis.from_state[stateC],
snapshot=initC,
engine=toy_eng,
)
gsC = bootstrapC.run()
Now that we've done that for all 3 states, let's look at the trajectories we generated.
In [15]:
plot.plot([s.trajectory for s in gsA]+[s.trajectory for s in gsB]+[s.trajectory for s in gsC]);
Finally, we join these into one SampleSet
. The function relabel_replicas_per_ensemble
ensures that the trajectory associated with each ensemble has a unique replica ID.
In [16]:
total_sample_set = paths.SampleSet.relabel_replicas_per_ensemble(
[gsA, gsB, gsC]
)
Up to this point, we haven't stored anything in files. In other notebooks, a lot of the storage is done automatically. Here we'll show you how to store a few things manually. Instead of storing the entire bootstrapping history, we'll only store the final trajectories we get out.
First we create a file. When we create it, the file also requires the template
snapshot.
In [17]:
storage = paths.Storage("mstis_bootstrap.nc", "w")
The storage will recursively store data, so storing total_sample_set
leads to automatic storage of all the Sample
objects in that sampleset, which in turn leads to storage of all the ensemble, trajectories, and snapshots.
Since the path movers used in bootstrapping and the engine are not required for the sampleset, they would not be stored. We explicitly store the engine for later use, but we won't need the path movers, so we don't try to store them.
The sync_all()
function ensures that all the data that has been created is also saved to the file.
In [18]:
storage.save(total_sample_set)
storage.save(toy_eng)
Out[18]:
Now we check to make sure that we actually have stored the objects that we claimed to store. There should be 0 pathmovers, 1 engine, 12 samples (4 samples from each of 3 transitions), and 1 sampleset. There will be some larger number of snapshots. There will also be a larger number of ensembles, because each ensemble is defined in terms of subensembles, each of which gets saved.
In [19]:
print("PathMovers:", len(storage.pathmovers))
print("Engines:", len(storage.engines))
print("Samples:", len(storage.samples))
print("SampleSets:", len(storage.samplesets))
print("Snapshots:", len(storage.snapshots))
print("Ensembles:", len(storage.ensembles))
print("CollectiveVariables:", len(storage.cvs))
Finally, we close the storage. Not strictly necessary, but a good habit.
In [20]:
storage.close()
In [ ]: