functions studying

SuperposeFeaturizer

Featurizer based on Euclidean atom distances to a reference structure.

This featurizer transforms a dataset of MD trajectories into a vector dataset. Each frame of each trajectory is represented by a vector containing the distances from a specified set of atoms to the 'reference position' of those same atoms in reference_traj.

Parameters

atom_indices : np.ndarray, shape=(n_atoms,), dtype=int
reference_traj : md.Trajectory (only the first frame in reference_traj is used)
superpose_atom_indices : np.ndarray, shape=(n_atoms,), dtype=int
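
The per-frame feature vector described above can be sketched in plain Python. This is only an illustration under stated assumptions: every frame is taken to be already superposed onto the reference frame (the alignment step is not shown), coordinates are plain (x, y, z) tuples rather than an md.Trajectory, and the helper name `distances_to_reference` is hypothetical, not part of any library.

```python
import math


def distances_to_reference(frame, reference, atom_indices):
    """Return one Euclidean distance per selected atom.

    `frame` and `reference` are sequences of (x, y, z) tuples, one per atom.
    Assumption: `frame` has already been superposed onto `reference`.
    """
    return [math.dist(frame[i], reference[i]) for i in atom_indices]


# Hypothetical toy data: a three-atom reference frame and a two-frame trajectory.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],  # same as the reference
    [(3.0, 4.0, 0.0), (1.0, 0.0, 2.0), (0.0, 1.0, 0.0)],  # atoms 0 and 1 displaced
]
atom_indices = [0, 1]

# One feature vector per frame, one entry per selected atom.
features = [distances_to_reference(f, reference, atom_indices) for f in trajectory]
```

Each row of `features` has as many entries as `atom_indices`, which is what turns a trajectory of 3D frames into a fixed-length vector dataset.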

KCenters

MarkovStateModel

class MarkovStateModel(BaseEstimator, _MappingTransformMixin, _SampleMSMMixin):
    """Reversible Markov State Model

    This model fits a first-order Markov model to a dataset of integer-valued
    timeseries. The key estimated attribute, ``transmat_``, is a matrix
    containing the estimated probability of transitioning between pairs
    of states in the duration specified by ``lag_time``.

    Unless otherwise specified, the model is constrained to be reversible
    (satisfy detailed balance), which is appropriate for equilibrium chemical
    systems.

    Parameters
    ----------
    lag_time : int
        The lag time of the model
    n_timescales : int, optional
        The number of dynamical timescales to calculate when diagonalizing
        the transition matrix.
    reversible_type : {'mle', 'transpose', None}
        Method by which the reversibility of the transition matrix
        is enforced. 'mle' uses a maximum likelihood method that is
        solved by numerical optimization, and 'transpose'
        uses a more restrictive (but less computationally complex)
        direct symmetrization of the expected number of counts.
    ergodic_cutoff : float or {'on', 'off'}, default='on'
        Only the maximal strongly ergodic subgraph of the data is used to build
        an MSM. Ergodicity is determined by ensuring that each state is
        accessible from each other state via one or more paths involving edges
        with a number of observed directed counts greater than or equal to
        ``ergodic_cutoff``. By setting ``ergodic_cutoff`` to 0 or
        'off', this trimming is turned off. Setting it to 'on' sets the
        cutoff to the minimal possible count value.
    prior_counts : float, optional
        Add a number of "pseudo counts" to each entry in the counts matrix
        after ergodic trimming.  When prior_counts == 0 (default), the assigned
        transition probability between two states with no observed transitions
        will be zero, whereas when prior_counts > 0, even these unobserved
        transitions will be given nonzero probability.
    sliding_window : bool, optional
        Count transitions using a window of length ``lag_time``, which is slid
        along the sequences 1 unit at a time. This yields more transition
        counts, but they cannot be assumed to be statistically independent.
        Otherwise, the sequences are simply subsampled at an interval of
        ``lag_time``.
    verbose : bool
        Enable verbose printout

    References
    ----------
    .. [1] Prinz, Jan-Hendrik, et al. "Markov models of molecular kinetics:
       Generation and validation." J. Chem. Phys. 134.17 (2011): 174105.
    .. [2] Pande, V. S., K. A. Beauchamp, and G. R. Bowman. "Everything you
       wanted to know about Markov State Models but were afraid to ask."
       Methods 52.1 (2010): 99-105.

    Attributes
    ----------
    n_states_ : int
        The number of states in the model
    mapping_ : dict
        Mapping between "input" labels and internal state indices used by the
        counts and transition matrix for this Markov state model. Input states
        need not necessarily be integers in (0, ..., n_states_ - 1), for
        example. The semantics of ``mapping_[i] = j`` is that state ``i`` from
        the "input space" is represented by the index ``j`` in this MSM.
    countsmat_ : array_like, shape = (n_states_, n_states_)
        Number of transition counts between states. countsmat_[i, j] is the
        number of observed transitions from state `i` to state `j`, counted
        during `fit()`. The indices `i` and `j` are the "internal" indices
        described above. No correction for reversibility is made to this
        matrix.
    transmat_ : array_like, shape = (n_states_, n_states_)
        Maximum likelihood estimate of the reversible transition matrix.
        The indices `i` and `j` are the "internal" indices described above.
    populations_ : array, shape = (n_states_,)
        The equilibrium population (stationary eigenvector) of transmat_.
    """
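
To make the `sliding_window`, `prior_counts`, and 'transpose' options in the docstring concrete, here is a rough dependency-free sketch. It is not the library's implementation: the function names `count_transitions` and `transpose_estimate` are hypothetical, ergodic trimming and the 'mle' estimator are left out, and every state is assumed to be visited at least once so that no row sum is zero.

```python
def count_transitions(sequences, lag_time, n_states, sliding_window=True):
    """Build an n_states x n_states matrix of observed transition counts."""
    counts = [[0.0] * n_states for _ in range(n_states)]
    # A sliding window advances one frame at a time; otherwise the sequence
    # is effectively subsampled at an interval of lag_time before counting.
    step = 1 if sliding_window else lag_time
    for seq in sequences:
        for t in range(0, len(seq) - lag_time, step):
            counts[seq[t]][seq[t + lag_time]] += 1.0
    return counts


def transpose_estimate(counts, prior_counts=0.0):
    """Symmetrize the counts, then row-normalize into a transition matrix."""
    n = len(counts)
    # Add pseudocounts to every entry, then average with the transpose.
    padded = [[c + prior_counts for c in row] for row in counts]
    sym = [[0.5 * (padded[i][j] + padded[j][i]) for j in range(n)]
           for i in range(n)]
    row_sums = [sum(row) for row in sym]
    total = sum(row_sums)
    transmat = [[sym[i][j] / row_sums[i] for j in range(n)] for i in range(n)]
    # For symmetric counts, the stationary distribution is proportional
    # to the row sums, so detailed balance holds by construction.
    populations = [s / total for s in row_sums]
    return transmat, populations


# Hypothetical toy input: one integer-valued state sequence over 3 states.
sequences = [[0, 1, 2, 0, 1, 2, 0, 1, 1, 2]]

sliding = count_transitions(sequences, lag_time=2, n_states=3)
subsampled = count_transitions(sequences, lag_time=2, n_states=3,
                               sliding_window=False)

transmat, populations = transpose_estimate(sliding)
transmat_prior, _ = transpose_estimate(sliding, prior_counts=1.0)
```

In this toy case the sliding window produces twice as many counts as subsampling, and the self-transition of state 0, which is never observed, only gets a nonzero probability once `prior_counts` is positive.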