Featurizer based on euclidian atom distances to reference structure.
This featurizer transforms a dataset containing MD trajectories into a vector dataset by representing each frame in each of the MD trajectories by a vector containing the distances from a specified set of atoms to the 'reference position' of those atoms, in reference_traj.
Parameters
class MarkovStateModel(BaseEstimator, _MappingTransformMixin, _SampleMSMMixin):
"""Reversible Markov State Model
This model fits a first-order Markov model to a dataset of integer-valued
timeseries. The key estimated attribute, ``transmat_`` is a matrix
containing the estimated probability of transitioning between pairs
of states in the duration specified by ``lag_time``.
Unless otherwise specified, the model is constrained to be reversible
(satisfy detailed balance), which is appropriate for equilibrium chemical
systems.
Parameters
----------
lag_time : int
The lag time of the model
n_timescales : int, optional
The number of dynamical timescales to calculate when diagonalizing
the transition matrix.
reversible_type : {'mle', 'transpose', None}
Method by which the reversibility of the transition matrix
is enforced. 'mle' uses a maximum likelihood method that is
solved by numerical optimization, and 'transpose'
uses a more restrictive (but less computationally complex)
direct symmetrization of the expected number of counts.
ergodic_cutoff : float or {'on', 'off'}, default='on'
Only the maximal strongly ergodic subgraph of the data is used to build
an MSM. Ergodicity is determined by ensuring that each state is
accessible from each other state via one or more paths involving edges
with a number of observed directed counts greater than or equal to
``ergodic_cutoff``. By setting ``ergodic_cutoff`` to 0 or
'off', this trimming is turned off. Setting it to 'on' sets the
cutoff to the minimal possible count value.
prior_counts : float, optional
Add a number of "pseudo counts" to each entry in the counts matrix
after ergodic trimming. When prior_counts == 0 (default), the assigned
transition probability between two states with no observed transitions
will be zero, whereas when prior_counts > 0, even this unobserved
transitions will be given nonzero probability.
sliding_window : bool, optional
Count transitions using a window of length ``lag_time``, which is slid
along the sequences 1 unit at a time, yielding transitions which
contain more data but cannot be assumed to be statistically
independent. Otherwise, the sequences are simply subsampled at an
interval of ``lag_time``.
verbose : bool
Enable verbose printout
References
----------
.. [1] Prinz, Jan-Hendrik, et al. "Markov models of molecular kinetics:
Generation and validation." J Chem. Phys. 134.17 (2011): 174105.
.. [2] Pande, V. S., K. A. Beauchamp, and G. R. Bowman. "Everything you
wanted to know about Markov State Models but were afraid to ask"
Methods 52.1 (2010): 99-105.
Attributes
----------
n_states_ : int
The number of states in the model
mapping_ : dict
Mapping between "input" labels and internal state indices used by the
counts and transition matrix for this Markov state model. Input states
need not necessarily be integers in (0, ..., n_states_ - 1), for
example. The semantics of ``mapping_[i] = j`` is that state ``i`` from
the "input space" is represented by the index ``j`` in this MSM.
countsmat_ : array_like, shape = (n_states_, n_states_)
Number of transition counts between states. countsmat_[i, j] is counted
during `fit()`. The indices `i` and `j` are the "internal" indices
described above. No correction for reversibility is made to this
matrix.
transmat_ : array_like, shape = (n_states_, n_states_)
Maximum likelihood estimate of the reversible transition matrix.
The indices `i` and `j` are the "internal" indices described above.
populations_ : array, shape = (n_states_,)
The equilibrium population (stationary eigenvector) of transmat_
"""
In [ ]: