Bayesian Markov state model
Variant of MarkovStateModel that estimates a distribution over transition matrices, instead of a single transition matrix, using Metropolis Markov chain Monte Carlo. This distribution gives information about the statistical uncertainty in the transition matrix (and in functions of the transition matrix), and is stored in all_transmats_.
Parameters:
lag_time : int
n_samples : int, default=100
n_steps : int, default=n_states
n_chains : int, default=n_procs
n_timescales : int, optional
reversible : bool, default=True
ergodic_cutoff : int, default=1
prior_counts : float, optional
sliding_window : bool, optional
random_state : int or RandomState instance or None (default)
sampler : {‘metzner’, ‘metzner_py’}
verbose : bool
Notes
Markov chain Monte Carlo can be computationally expensive. To get good (converged) results and acceptable performance, you’ll likely need to play around with the n_samples, n_steps and n_chains parameters. n_samples gives the total number of transition matrices sampled from the posterior. These samples are generated from n_chains different independent MCMC chains, at an interval of n_steps. The total number of iterations of MCMC performed during fit() is n_samples * n_steps. Increasing n_chains therefore does not alter the total number of iterations – instead it controls whether those iterations occur as part of one long chain or multiple shorter chains (which are run in parallel for sampler=='metzner').
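The relationship between the three tuning parameters can be sketched with a little arithmetic. The helper below is hypothetical (it is not part of the library), and it assumes that samples are divided as evenly as possible among the chains, which the notes above do not state explicitly.

```python
def mcmc_budget(n_samples, n_steps, n_chains):
    """Summarize the MCMC workload implied by the three tuning parameters."""
    # Total iterations, as stated in the notes: n_samples * n_steps.
    total_iterations = n_samples * n_steps
    # Assumption for this sketch: samples are split evenly across chains.
    base, extra = divmod(n_samples, n_chains)
    samples_per_chain = [base + (1 if i < extra else 0) for i in range(n_chains)]
    iterations_per_chain = [s * n_steps for s in samples_per_chain]
    return total_iterations, samples_per_chain, iterations_per_chain


one_chain = mcmc_budget(n_samples=100, n_steps=10, n_chains=1)
four_chains = mcmc_budget(n_samples=100, n_steps=10, n_chains=4)

# Same total work either way; only the length of each chain changes.
print(one_chain[0], four_chains[0])
print(four_chains[2])
```

Under that assumption, going from one chain to four leaves the total at 1000 iterations but shortens each chain to 250, so each chain has less time to move away from its starting point.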
References
[R27] | P. Metzner, F. Noé and C. Schütte, “Estimating the sampling error: Distribution of transition matrices and functions of transition matrices for given trajectory data.” Phys. Rev. E 80, 021106 (2009). |
Attributes
n_states_ | (int) The number of states in the model |
mapping_ | (dict) Mapping between “input” labels and internal state indices used by the counts and transition matrix for this Markov state model. Input states need not necessarily be integers in (0, ..., n_states_ - 1), for example. The semantics of mapping_[i] = j is that state i from the “input space” is represented by the index j in this MSM. |
countsmat_ | (array_like, shape = (n_states_, n_states_)) Number of transition counts between states. countsmat_[i, j] is the number of transitions from state i to state j counted during fit(). The indices i and j are the “internal” indices described above. No correction for reversibility is made to this matrix. |
transmats_ | (array_like, shape = (n_samples, n_states_, n_states_)) Samples from the posterior ensemble of transition matrices. |
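The semantics of mapping_ can be illustrated with plain dictionaries. This is a minimal sketch that does not call the library: the labels are made up, and assigning internal indices in sorted label order is an assumption made only for the example.

```python
# Hypothetical input sequences with non-integer state labels.
sequences = [["A", "B", "A", "C"], ["C", "C", "B"]]

# Build a label -> internal index mapping. Sorted order is an assumption of
# this sketch; the library may assign indices differently.
labels = sorted({label for seq in sequences for label in seq})
mapping = {label: index for index, label in enumerate(labels)}

# mapping[i] = j means input state i is represented by internal index j.
internal = [[mapping[label] for label in seq] for seq in sequences]

# The inverse mapping recovers the original labels from internal indices.
inverse_mapping = {index: label for label, index in mapping.items()}
recovered = [[inverse_mapping[i] for i in seq] for seq in internal]

print(mapping)
print(internal)
```

The internal indices are what the rows and columns of countsmat_ and the sampled transition matrices refer to, which is why the round trip back to labels is needed when reporting results.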
Methods
fit(sequences[, y]) | |
fit_transform(X[, y]) | Fit to data, then transform it. |
get_params([deep]) | Get parameters for this estimator. |
inverse_transform(sequences) | Transform a list of sequences from internal indexing into labels |
set_params(**params) | Set the parameters of this estimator. |
summarize() | |
transform(sequences[, mode]) | Transform a list of sequences to internal indexing |
Implied relaxation timescales of each sample in the ensemble
Returns:
timescales : array-like, shape = (n_samples, n_timescales)
References
[R28] | Prinz, Jan-Hendrik, et al. “Markov models of molecular kinetics: Generation and validation.” J. Chem. Phys. 134.17 (2011): 174105. |
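As a rough illustration of how per-sample timescales translate into an uncertainty estimate, the sketch below applies the standard relation \(t_i = -\tau / \ln \lambda_i\) (with \(\tau\) the lag time and \(\lambda_i\) the i-th eigenvalue) to a small stack of matrices. The function implied_timescales is a hypothetical helper, not the library's implementation, and the matrices are hand-written stand-ins rather than real posterior samples.

```python
import numpy as np


def implied_timescales(transmat, lag_time, n_timescales):
    """Return the slowest implied timescales of one transition matrix."""
    # Assumes real, positive eigenvalues, as expected for a reversible model.
    eigvals = np.sort(np.linalg.eigvals(transmat).real)[::-1]
    # Skip the stationary eigenvalue (equal to 1): it has no finite timescale.
    return -lag_time / np.log(eigvals[1:n_timescales + 1])


# Hand-written stand-ins for posterior samples, not real MCMC output.
samples = np.array([
    [[0.90, 0.10], [0.20, 0.80]],
    [[0.85, 0.15], [0.25, 0.75]],
    [[0.95, 0.05], [0.10, 0.90]],
])
lag_time = 2

# One row of timescales per sampled matrix: shape (n_samples, n_timescales).
timescales = np.array(
    [implied_timescales(T, lag_time, n_timescales=1) for T in samples]
)

# The spread across samples is a simple measure of statistical uncertainty.
mean_timescale = timescales.mean(axis=0)
std_timescale = timescales.std(axis=0)
print(timescales.shape, mean_timescale, std_timescale)
```

The same pattern (evaluate a function on every sampled matrix, then summarize across the first axis) applies to any other function of the transition matrix.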
Eigenvalues of the transition matrices.
Returns:
eigs : array-like, shape = (n_samples, n_timescales+1)
Left eigenvectors, \(\Phi\), of each transition matrix in the ensemble
Each transition matrix’s left eigenvectors are normalized such that:
- lv[:, 0] is the equilibrium populations, normalized such that sum(lv[:, 0]) == 1
- The eigenvectors satisfy sum(lv[:, i] * lv[:, i] / model.populations_) == 1. In math notation, this is \(\langle \phi_i, \phi_i \rangle_{\mu^{-1}} = 1\)
Returns:
lv : array-like, shape=(n_samples, n_states, n_timescales+1)
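The two normalization conditions for the left eigenvectors can be checked numerically on a single matrix. This is a minimal NumPy sketch, not the library's code; the 3-state transition matrix is a made-up reversible example.

```python
import numpy as np

# Made-up reversible (tridiagonal) transition matrix; rows sum to 1.
T = np.array([
    [0.8, 0.2, 0.0],
    [0.1, 0.8, 0.1],
    [0.0, 0.2, 0.8],
])

# Left eigenvectors of T are the right eigenvectors of T transposed.
eigvals, lv = np.linalg.eig(T.T)
order = np.argsort(-eigvals.real)  # sort by decreasing eigenvalue
eigvals = eigvals.real[order]
lv = lv.real[:, order]

# First condition: lv[:, 0] is the equilibrium distribution and sums to 1.
populations = lv[:, 0] / lv[:, 0].sum()
lv[:, 0] = populations

# Second condition: <phi_i, phi_i> weighted by 1 / populations equals 1.
for i in range(1, lv.shape[1]):
    lv[:, i] /= np.sqrt(np.sum(lv[:, i] ** 2 / populations))

norms = np.sum(lv * lv / populations[:, None], axis=0)
print(populations, norms)
```

Note that the first column satisfies the second condition automatically, because sum(populations ** 2 / populations) is just sum(populations).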
Right eigenvectors, \(\Psi\), of each transition matrix in the ensemble
Each transition matrix’s right eigenvectors are normalized such that:
Weighted by the stationary distribution, the right eigenvectors are normalized to 1. That is,
sum(rv[:, i] * rv[:, i] * self.populations_) == 1,
or \(\langle \psi_i, \psi_i \rangle_{\mu} = 1\)
Returns:
rv : array-like, shape=(n_samples, n_states, n_timescales+1)
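The right-eigenvector normalization can be checked the same way. Again this is only a sketch in NumPy on a made-up reversible matrix, not the library's implementation.

```python
import numpy as np

# Made-up reversible (tridiagonal) transition matrix; rows sum to 1.
T = np.array([
    [0.8, 0.2, 0.0],
    [0.1, 0.8, 0.1],
    [0.0, 0.2, 0.8],
])

# Stationary distribution: left eigenvector for the largest eigenvalue.
w, left = np.linalg.eig(T.T)
pi = left[:, np.argmax(w.real)].real
populations = pi / pi.sum()

# Right eigenvectors, sorted by decreasing eigenvalue.
eigvals, rv = np.linalg.eig(T)
order = np.argsort(-eigvals.real)
eigvals = eigvals.real[order]
rv = rv.real[:, order]

# Normalize each column so that sum(rv[:, i] ** 2 * populations) == 1.
rv /= np.sqrt(np.sum(rv ** 2 * populations[:, None], axis=0))

norms = np.sum(rv * rv * populations[:, None], axis=0)
print(norms)
```

With this weighting, the first right eigenvector is a constant vector of magnitude 1; its overall sign is not fixed by the normalization alone.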
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
Parameters:
X : numpy array of shape [n_samples, n_features]
y : numpy array of shape [n_samples]
Returns:
X_new : numpy array of shape [n_samples, n_features_new]
Get parameters for this estimator.
Parameters:
deep : boolean, optional
Returns:
params : mapping of string to any
Transform a list of sequences from internal indexing into labels
Parameters:
sequences : list
Returns:
sequences : list
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Returns:
self
Transform a list of sequences to internal indexing
Recall that sequences can be arbitrary labels, whereas transmat_ and countsmat_ are indexed with integers between 0 and n_states - 1. This method maps a set of sequences from the labels onto this internal indexing.
Parameters:
sequences : list of array-like
mode : {‘clip’, ‘fill’}
Returns:
mapped_sequences : list