This notebook takes an old (and large) TPS simulation file and selects some snapshots from it to use as input data.
Note: this first version is quick and dirty. There are probably refinements that would select better snapshots, but the only goal here is to get initial data to our colleagues.
In [1]:
import openpathsampling as paths
import random
In [2]:
%%time
storage = paths.Storage("alanine_dipeptide_tps.nc", "r")
In [3]:
print(storage.file_size_str)
In [4]:
n_snapshots = len(storage.snapshots)
print(n_snapshots)
In [5]:
stateA = storage.volumes['C_7eq']
stateB = storage.volumes['alpha_R']
Now we do the main calculation. Each selected snapshot must lie outside both states, and no snapshot is used more than once. (In other words, snapshots are chosen randomly without replacement.)
In addition, OPS snapshots are always listed in pairs, with velocities reversed. (The data is only stored once, but both can be accessed directly from the snapshot storage.) Because of this, we'll make sure we only take the even-numbered snapshots.
In [6]:
%%time
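snapshots = []
while len(snapshots) < 1000:
    # pick a random even index; odd indices are the velocity-reversed copies
    random_choice = random.randint(0, (n_snapshots // 2) - 1)
    snap = storage.snapshots[random_choice * 2]
    # keep only snapshots outside both states that have not been picked yet
    if not stateA(snap) and not stateB(snap) and snap not in snapshots:
        snapshots.append(snap)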
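The rejection loop above keeps drawing until it has 1000 snapshots, so it would never finish if the file held fewer than 1000 qualifying snapshots. As a rough sketch of one alternative (not what this notebook does), the even indices can be shuffled once and scanned in order, which gives the same "random, without replacement" selection but always terminates. The sketch works on plain integer indices so that it runs without the storage file; `pick_even_indices` and `is_allowed` are hypothetical names, and `is_allowed` stands in for the `not stateA(snap) and not stateB(snap)` check.

```python
import random


def pick_even_indices(n_total, n_wanted, is_allowed, rng=random):
    """Return up to n_wanted distinct even indices in [0, n_total) that pass is_allowed.

    Hypothetical helper: is_allowed(idx) plays the role of the
    "snapshot is in neither state" test used in the notebook.
    """
    even_indices = list(range(0, n_total, 2))
    rng.shuffle(even_indices)  # random order, each index appears exactly once
    chosen = []
    for idx in even_indices:
        if is_allowed(idx):
            chosen.append(idx)
            if len(chosen) == n_wanted:
                break
    return chosen  # may be shorter than n_wanted if too few indices qualify


# Toy predicate (an assumption for illustration): reject multiples of 10.
picked = pick_even_indices(1000, 50, lambda i: i % 10 != 0, rng=random.Random(0))
```

With real data, the predicate would load `storage.snapshots[idx]` and apply the two state volumes, and the returned indices would be used to build the `snapshots` list.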
In [7]:
new_store = paths.Storage("snapshots.nc", "w")
In [8]:
new_store.save(snapshots);
In [9]:
# save the old engine because we'll re-use its topology later
new_store.save(storage.engines[0]);
In [10]:
new_store.sync()
new_store.close()