Combining SequentialEnsembles with SlicedTrajectoryEnsembles

The ensemble creation features in OpenPathSampling are very powerful, but the ways that some of them interact with each other can be quite confusing. In this document, we explore what happens when one combines SequentialEnsembles and SlicedTrajectoryEnsembles.

First, let's summarize what each of these ensembles do. SequentialEnsembles take a list of ensembles and applies them in order, so a trajectory satisfies the first ensemble for as long as it can, then it switches to the next ensemble, and so forth. SlicedTrajectoryEnsembles modify the trajectory seen by a given ensemble by applying a Python slice object to select only certain frames from the ensemble.

There are several ways of slicing SequentialEnsembles; the purpose of this document is to explain the differences.


To make discussion easy, we'll use a fake order parameter and fake trajectories in this example. We'll also define all our ensembles in terms of AllInXEnsembles and AllOutXEnsembles, which are easier to visualize.

When dealing with SequentialEnsembles, we should distinguish between two types of trajectories. There's the total trajectory for the ensemble, which the normal trajectory including all frames. However, each subensemble of the SequentialEnsemble applies to a subtrajectory. The key to the subtleties of how SequentialEnsembles interact with SlicedTrajectoryEnsembles is to consider the which of these trajectories we really want to slice.

First, we import the necessary things and set up our order parameter and our initial ensemble:


In [1]:
from openpathsampling.ensemble import SlicedTrajectoryEnsemble, SequentialEnsemble, AllInXEnsemble, AllOutXEnsemble, LengthEnsemble
from openpathsampling.collectivevariable import FunctionCV
from openpathsampling.volume import CVDefinedVolume
from openpathsampling.engines import Trajectory

# This is a hack to easily create test sequences that act as "trajectories" for us
from openpathsampling.tests.test_helpers import CallIdentity
op = CallIdentity()
vol = CVDefinedVolume(op, -0.5, 0.5)

ens = SequentialEnsemble([
    AllInXEnsemble(vol), 
    AllOutXEnsemble(vol), 
    AllInXEnsemble(vol) & LengthEnsemble(1)
])

inV = 0.0
outV = 1.0

Slicing the global trajectory for the whole SequentialEnsemble

One approach we might need is to apply the whole SequentialEnsemble on some appropriate sliced trajectory. In our example, this translates as saying that the even slices of the total trajectory must satisfy the In-Out-In ensemble.

This is quite easy to implement:


In [2]:
even_slice = slice(None,None,2)
total_sliced_whole = SlicedTrajectoryEnsemble(ens, even_slice)

Slicing the subtrajectory for a member of the SequentialEnsemble

Another approach we might desire is to slice the subtrajectory seen by one of the members of the SequentialEnsemble. Perhaps, for example, we want an ensemble which consists of a segment with all frames inside the state, then a segment where the even frames of that segement are outside the state, and then a segment with all frames inside the state.

Note carefully here that the slicing refers to the subtrajectory: it's the even frames counting from the beginning of the subtrajectory, regardless of how many frames are in the total trajectory.

In this case, we have a define a slightly different SequentialEnsemble:


In [3]:
subtraj_sliced_member = SequentialEnsemble([
    AllInXEnsemble(vol),
    SlicedTrajectoryEnsemble(AllOutXEnsemble(vol), even_slice),
    AllInXEnsemble(vol) & LengthEnsemble(1)
])

Slicing the total trajectory for a member of the SequentialEnsemble

The last approach would be to use a slice based on the total trajectory, but only apply it to a member of the SequentialEnsemble. For example, we might want to have an ensemble that begins in a state, and after the first exit all even frames (counted from the start of the trajectory) are outside the state. This differs from the previous example because the count is based on the total trajectory, not the subtrajectory which is passed to the ensemble. To do this, we need to use a SlicedSequentialEnsemble, a subclass of SequentialEnsemble which requires initializing with a list containing one slice for each member ensemble (the special case None is turned into slice(None,None)).


In [4]:
# TODO: We have yet to implement SlicedSequentialEnsemble. 
# This would be nice for completeness, but it really shouldn't be a priority.

Example trajectories for these ensembles

Let's take these abstract ideas and apply them to some real trajectories to highlight the differences between these ensembles.


In [5]:
traj = {}
traj[0] = Trajectory([inV, outV, inV])
traj[1] = Trajectory([inV, inV, outV, outV, inV, inV])
traj[2] = Trajectory([inV, outV, outV, outV, inV])
traj[3] = Trajectory([inV, outV, inV, outV, inV, outV, inV, inV])
traj[4] = Trajectory([inV, outV, outV, inV, outV, inV, outV, inV, inV])
traj[5] = Trajectory([inV, outV, outV, inV, outV, inV, outV, inV])
traj[6] = Trajectory([inV, outV, inV, outV, inV, inV])
traj[7] = Trajectory([inV, outV, outV, outV, inV, outV])

First, let's use the original ensemble ens: just from inspection, it should be pretty clear that traj[0] and traj[2] are accepted by that ensemble, and the others are not. Let's confirm that:


In [6]:
for i in range(len(traj)):
    print "ens(traj["+str(i)+"]) ==", ens(traj[i])


ens(traj[0]) == True
ens(traj[1]) == False
ens(traj[2]) == True
ens(traj[3]) == False
ens(traj[4]) == False
ens(traj[5]) == False
ens(traj[6]) == False
ens(traj[7]) == False

What about when we slice the whole trajectory into its even components as with total_sliced_whole? Remembering that Python counts from 0, that means a trajectory like traj[4] becomes "in, out, out, out, in", which satifies the original ensemble. What happens with the other ensembles?


In [7]:
for i in range(len(traj)):
    print "total_sliced_whole(traj["+str(i)+"]) ==", total_sliced_whole(traj[i])


total_sliced_whole(traj[0]) == False
total_sliced_whole(traj[1]) == True
total_sliced_whole(traj[2]) == True
total_sliced_whole(traj[3]) == False
total_sliced_whole(traj[4]) == True
total_sliced_whole(traj[5]) == False
total_sliced_whole(traj[6]) == False
total_sliced_whole(traj[7]) == True

Notice that traj[0] no longer works: it turned into "in, in", which does not satisfy the ensemble. Not surprisingly, 1, 2, and 4 satisfy the ensemble, and 3 and 6 don't (since all their "out" frames are odd).

But what about traj[5]? It seems like it should satisfy the ensemble: it starts and ends in the state, and all the even frames between are outside the state. But the problem is that the final frame is an odd-numbered frame, and therefore it gets removed by the slicing before the ensemble sees it. This is the same reason that traj[4] is accepted despite violating the LengthEnsemble(1) condition for the last ensemble of the sequence.

Let's also look more carefully at traj[7]: it may seem odd that this one is accepted, since it ends with a frame outside the volume. But again, the trajectory that the ensemble actually sees after the slicing doesn't include that frame, because it is an odd-numbered frame.

Next, we consider the case of subtraj_sliced_member, where we slice based on the subtrajectory instead of the total trajectory. Let's see what comes out of that:


In [8]:
for i in range(len(traj)):
    print "subtraj_sliced_member(traj["+str(i)+"]) ==", subtraj_sliced_member(traj[i])


subtraj_sliced_member(traj[0]) == False
subtraj_sliced_member(traj[1]) == False
subtraj_sliced_member(traj[2]) == False
subtraj_sliced_member(traj[3]) == True
subtraj_sliced_member(traj[4]) == False
subtraj_sliced_member(traj[5]) == False
subtraj_sliced_member(traj[6]) == True
subtraj_sliced_member(traj[7]) == False

One important aspect about this is that SequentialEnsemble uses a hungry matching algorithm, meaning that each ensemble matches as much of the trajectory as it can before going on to the next ensemble. So, for example, even though traj[6] ends with two frames inside the ensemble, the sliced member of the ensemble includes the first in-volume as part of its matching.

Perhaps the clearest way to explain the differences is to make a table of what frames each ensemble sees for these trajectories:

traj ens total_sliced_whole subtraj_sliced_member
0 0:in 1:out 2:in 0:in 2:in 0:in 1:out
1 0:in 1:in 2:out 3:out 4:in 5:in 0:in 2:out 4:in 0:in 1:in 2:out 4:in 5:in
2 0:in 1:out 2:out 3:out 4:in 0:in 2:out 4:in 0:in 1:out 3:out
3 0:in 1:out 2:in 3:out [4:in 5:out 6:in 7:in] 0:in 2:in 4:in 6:in 0:in 1:out 3:out 5:out 7:in
4 0:in 1:out 2:out 3:in 4:out [5:in 6:out 7:in 8:in] 0:in 2:out 4:out 6:out 8:in 0:in 1:out 3:in 4:out [5:in 6:out 7:in 8:in]
5 0:in 1:out 2:out 3:in 4:out [5:in 6:out 7:in] 0:in 2:out 4:out 6:out 0:in 1:out 3:in 4:out [5:in 6:out 7:in]
6 0:in 1:out 2:in 3:out [4:in 5:in] 0:in 2:in 4:in 0:in 1:out 3:out 5:in
7 0:in 1:out 2:out 3:out 4:in 5:out 0:in 2:out 4:in 0:in 1:out 3:out 5:out

Brackets mark frames that aren't actually tested by the ensemble, because the trajectory has already failed. However, they're listed according to the last applied slice.

Nested Slices

What I have termed the "total" trajectory is only total in the sense that it was what was passed to the ensemble. If you nest SlicedTrajectoryEnsembles, the inner slice will be based on the output from the outer slice. For example, if you made an odd_slice along the same lines of the even_slice defined above, then SlicedTrajectoryEnsemble(ensemble SlicedTrajectoryEnsemble(ensemble, even_slice), odd_slice) would apply to frames [2, 6, 10, ...]: the even numbers with odd factors. The slices are applied sequentially. There is no way general way to combine slices: you just have to make the slice you want. So SlicedTrajectoryEnsemble(ensemble, even_slice) | SlicedTrajectoryEnsemble(ensemble, odd_slice) is not the same as ensemble: the combination of SlicedTrajectoryEnsembles looks at each of the subtrajectories, not at the total trajectory you would expect if you or'd together the two slices.


In [ ]: