This notebook provides an overview of the TIS analysis framework in OpenPathSampling. We start with the StandardTISAnalysis
object, which will probably meet the needs of most users. Then we go into details of how to set up custom objects for analysis, and how to assemble them into a generic TISAnalysis
object.
For an overview of TIS and this analysis framework, see http://openpathsampling.org/latest/topics/tis_analysis.html
In [1]:
# if our large test file is available, use it. Otherwise, use file generated from toy_mstis_2_run.ipynb
from __future__ import print_function
import os
test_file = "../toy_mstis_1k_OPS1.nc"
filename = "mstis.nc" # this requires newer functionality that in the standard tests
#filename = test_file if os.path.isfile(test_file) else "mstis.nc"
print('Using file ' + filename + ' for analysis')
In [2]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import openpathsampling as paths
import pandas as pd
In [3]:
%%time
storage = paths.AnalysisStorage(filename)
In [4]:
network = storage.networks[0]
scheme = storage.schemes[0]
In [5]:
stateA = storage.volumes['A']
stateB = storage.volumes['B']
stateC = storage.volumes['C']
all_states = [stateA, stateB, stateC] # all_states gives the ordering
In [6]:
from openpathsampling.analysis.tis import StandardTISAnalysis
In [7]:
# the scheme is only required if using the minus move for the flux
tis_analysis = StandardTISAnalysis(
network=network,
scheme=scheme,
max_lambda_calcs={t: {'bin_width': 0.05, 'bin_range': (0.0, 0.5)}
for t in network.sampling_transitions}
)
In [8]:
#tis_analysis.progress = 'silent'
In [9]:
%%time
rate_matrix = tis_analysis.rate_matrix(steps=storage.steps).to_pandas(order=all_states)
In [10]:
rate_matrix
Out[10]:
The rate matrix is a pandas.DataFrame
. pandas
has conveniences to easily convert that into a LaTeX table:
In [11]:
print(rate_matrix.to_latex(float_format="{:.2e}".format))
Note that there are many options for setting up the StandardTISAnalysis
object. Most customizations to the analysis can be performed by changing the initialization parameters of that object; see its documentation for details.
Once you run the rate calculation (or if you run tis_analysis.calculate(steps)
, you have already cached a large number of subcalculations. All of those are available in the results
dictionary, although the analysis object has a number of conveniences to access some of them.
Looking at the keys of the results dictionary, we can see what has been cached:
In [12]:
tis_analysis.results.keys()
Out[12]:
In practice, however, we won't go directly to the results dictionary. We'd rather use the convenience methods that make it easier to get to the interesting results.
We'll start by looking at the flux:
In [13]:
tis_analysis.flux_matrix
Out[13]:
In [14]:
s = paths.analysis.tis.flux_matrix_pd(tis_analysis.flux_matrix)
pd.DataFrame(s)
Out[14]:
Next we look at the total crossing probability (i.e., the crossing probability, joined by WHAM) for each sampled transition. We could also look at this per physical transition, but of course $A\to B$ and $A\to C$ are identical in MSTIS -- only the initial state matters.
In [15]:
# if you don't like the "Flux" label in a separate line, fix it by hand
print(s.to_latex(float_format='{:.4f}'.format))
Next we look at the total crossing probability (i.e., the crossing probability, joined by WHAM) for each sampled transition. We could also look at this per physical transition, but of course $A\to B$ and $A\to C$ are identical in MSTIS -- only the initial state matters.
In [16]:
for transition in network.sampling_transitions:
label = transition.name
tcp = tis_analysis.total_crossing_probability[transition]
plt.plot(tcp.x, np.log(tcp), label=label)
plt.title("Total Crossing Probability")
plt.xlabel("$\lambda$")
plt.ylabel("$\ln(P(\lambda | X_0))$")
plt.legend();
We may want to look in more detail at one of these, by checking the per-ensemble crossing probability (as well at the total crossing probability). Here we select based on the $A\to B$ transition, we would get the same results if we selected the transition using either trans = network.from_state[stateA]
or trans = network.transitions[(stateA, stateC)]
.
In [17]:
state_pair = (stateA, stateB)
trans = network.transitions[state_pair]
for ens in trans.ensembles:
crossing = tis_analysis.crossing_probability(ens)
label = ens.name
plt.plot(crossing.x, np.log(crossing), label=label)
tcp = tis_analysis.total_crossing_probability[state_pair]
plt.plot(tcp.x, np.log(tcp), '-k', label="total crossing probability")
plt.title("Crossing Probabilities, " + stateA.name + " -> " + stateB.name)
plt.xlabel("$\lambda$")
plt.ylabel("$\ln(P_A(\lambda | \lambda_i))$")
plt.legend();
Finally, we look at the last part of the rate calculation: the conditional transition probability. This is calculated for the outermost interface in each interface set.
In [18]:
tis_analysis.conditional_transition_probability
Out[18]:
The combined analysis is the easiest way to perform analysis, but if you need to customize things (or if you want to compare different calculation methods) you might want to create objects for components of the analysis individually. Note that unlike the StandardTISAnalysis
object, these do not cache their intermediate results.
In [19]:
from openpathsampling.analysis.tis import MinusMoveFlux
In [20]:
flux_calc = MinusMoveFlux(scheme)
To calculate the fluxes, we use the .calculate
method of the MinusMoveFlux
object:
In [21]:
%%time
fluxes = flux_calc.calculate(storage.steps)
In [22]:
fluxes
Out[22]:
This is in the same format as the flux_matrix
given above, and we can convert it to a pandas.DataFrame
in the same way. This can then be converted to a LaTeX table to copy-paste into an article in the same way as was done earlier.
In [23]:
pd.DataFrame(paths.analysis.tis.flux_matrix_pd(fluxes))
Out[23]:
The minus move flux calculates some intermediate information along the way, which can be of use for further analysis. This is cached when using the StandardTISAnalysis
, but can always be recalculated. The intermediate maps each (state, interface)
pair to a dictionary. For details on the structure of that dictionary, see the documentation of TrajectoryTransitionAnalysis.analyze_flux
.
In [24]:
%%time
flux_dicts = flux_calc.intermediates(storage.steps)[0]
In [25]:
flux_dicts
Out[25]:
The DictFlux
class (which is required for MISTIS, and often provides better statistics than the minus move flux in other cases) takes a pre-calculated flux dictionary for initialization, and always returns that dictionary. The dictionary is in the same format as the fluxes
returned by the MinusMoveFlux.calculate
method; here, we'll just use the results we calculated above:
In [26]:
from openpathsampling.analysis.tis import DictFlux
In [27]:
dict_flux = DictFlux(fluxes)
In [28]:
dict_flux.calculate(storage.steps)
Out[28]:
Note that DictFlux.calculate
just echoes back the dictionary we gave it, so it doesn't actually care if we give it the steps argument or not:
In [29]:
dict_flux.calculate(None)
Out[29]:
This object can be used to provide the flux part of the TIS calculation, in exactly the same way a MinusMoveFlux
object does.
To calculate the total crossing probability, we must first calculate the individual ensemble crossing probabilities. This is done by creating a histogram of the maximum values of the order parameter. The class to do that is FullHistogramMaxLambdas
. Then we'll create the TotalCrossingProbability
.
In [30]:
transition = network.sampling_transitions[0]
In [31]:
print(transition)
In [32]:
from openpathsampling.analysis.tis import FullHistogramMaxLambdas, TotalCrossingProbability
from openpathsampling.numerics import WHAM
In [33]:
max_lambda_calc = FullHistogramMaxLambdas(
transition=transition,
hist_parameters={'bin_width': 0.05, 'bin_range': (0.0, 0.5)}
)
We can also change the function used to calculate the maximum value of the order parameter with the max_lambda_func
parameter. This can be useful to calculate the crossing probabilities along some other order parameter.
To calculate the total crossing probability function, we also need a method for combining the ensemble crossing probability functions. We'll use the default WHAM here; see its documentation for details on how it can be customized.
In [34]:
combiner = WHAM(interfaces=transition.interfaces.lambdas)
Now we can put these together into the total crossing probability function:
In [35]:
total_crossing = TotalCrossingProbability(
max_lambda_calc=max_lambda_calc,
combiner=combiner
)
In [36]:
tcp = total_crossing.calculate(storage.steps)
In [37]:
plt.plot(tcp.x, np.log(tcp))
plt.title("Total Crossing Probability, exiting " + transition.stateA.name)
plt.xlabel("$\lambda$")
plt.ylabel("$\ln(P_A(\lambda | \lambda_i))$");
In [38]:
from openpathsampling.analysis.tis import ConditionalTransitionProbability
In [39]:
outermost_ensembles = [trans.ensembles[-1] for trans in network.sampling_transitions]
In [40]:
cond_transition = ConditionalTransitionProbability(
ensembles=outermost_ensembles,
states=network.states
)
In [41]:
ctp = cond_transition.calculate(storage.steps)
In [42]:
ctp
Out[42]:
StandardTISAnalysis.conditional_transition_probability
converts this into a pandas.DataFrame
, which gives prettier printing. However, the same data in included in this dict-of-dict structure.
If you're using the "standard" TIS approach, then the StandardTISAnalysis
object is the most efficient way to do it. However, if you want to use another analysis approach, it can be useful to see how the "standard" approach can be assembled.
This won't have all the shortcuts or saved intermediates that the specialized object does, but it will use the same algorithms to get the same results.
In [43]:
from openpathsampling.analysis.tis import StandardTransitionProbability, TISAnalysis
Some of the objects that we created in previous sections can be reused here. In particular, there is only only one flux calculation and only one conditional transitional transition probability per reaction network. However, the total crossing probability method is dependent on the transition (different order parameters might have different histrogram parameters). So we need to associate each transition with a different TotalCrossingProbability
object. In this example, we take the default behavior of WHAM (instead of specifying in explicitly, as above).
In [44]:
tcp_methods = {
trans: TotalCrossingProbability(
max_lambda_calc=FullHistogramMaxLambdas(
transition=trans,
hist_parameters={'bin_width': 0.05, 'bin_range': (0.0, 0.5)}
)
)
for trans in network.transitions.values()
}
The general TISAnalysis
object makes the most simple splitting: flux and transition probability. A single flux calculation is used for all transitions, but each transition has a different transition probability (since each transition can have a different total crossing probability). We make those objects here.
In [45]:
transition_probability_methods = {
trans: StandardTransitionProbability(
transition=trans,
tcp_method=tcp_methods[trans],
ctp_method=cond_transition
)
for trans in network.transitions.values()
}
Finally we put this all together into a TISAnalysis
object, and calculate the rate matrix.
In [46]:
analysis = TISAnalysis(
network=network,
flux_method=dict_flux,
transition_probability_methods=transition_probability_methods
)
In [47]:
analysis.rate_matrix(storage.steps)
Out[47]:
This is the same rate matrix as we obtained with the StandardTISAnalysis
. It is a little faster because we used the precalculated DictFlux
object in this instead of the MinusMoveFlux
, otherwise this would be slower.
In [ ]: