Reversible Gaussian Hidden Markov Model with L1-Fusion Regularization
This estimator fits a Hidden Markov model with Gaussian emission distributions to a vector dataset, constrained to be reversible (i.e., to satisfy detailed balance). The model is similar to a MarkovStateModel without “hard” assignments of conformations to clusters. Optionally, it can apply L1-regularization to the positions of the Gaussians. See [1] for details.
Parameters:
    n_states : int
    n_init : int
    n_em_iter : int
    n_lqa_iter : int
    thresh : float
    fusion_prior : float
    reversible_type : str
    transmat_prior : float, optional
    vars_prior : float, optional
    vars_weight : float, optional
    random_state : int, optional
    params : str
    init_params : str
    timing : bool, default=False
    n_hotstart : {int, ‘all’}
    init_algo : str
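For orientation, the following is a minimal construction sketch. The import path and class name (msmbuilder.hmm.GaussianFusionHMM) are assumptions inferred from this page's context rather than stated here, and the argument values are purely illustrative; only the parameter names come from the list above:

    import numpy as np
    # Assumed import path; adjust to where this estimator lives in your installation.
    from msmbuilder.hmm import GaussianFusionHMM

    # Two synthetic trajectories of 3-dimensional feature vectors.
    sequences = [np.random.randn(1000, 3), np.random.randn(500, 3)]

    model = GaussianFusionHMM(
        n_states=4,          # number of hidden states
        fusion_prior=1e-2,   # L1 fusion penalty strength (illustrative value)
        n_em_iter=100,
        thresh=1e-2,
        random_state=42,
    )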
References
[1] McGibbon, Robert T. et al., “Understanding Protein Dynamics with L1-Regularized Reversible Hidden Markov Models.” Proc. 31st Intl. Conf. on Machine Learning (ICML), 2014.
Attributes
    means_
    vars_
    transmat_
    populations_
    fit_logprob_
Methods
draw_centroids(sequences) | Find conformations most representative of model means. |
draw_samples(sequences, n_samples[, scheme, ...]) | Sample conformations from each state. |
fit(sequences[, y]) | Estimate model parameters. |
get_params([deep]) | Get parameters for this estimator. |
predict(sequences) | Find the most likely hidden-state sequence corresponding to each data time series. |
score(sequences) | Log-likelihood of sequences under the model. |
set_params(**params) | Set the parameters of this estimator. |
summarize() |
Estimate model parameters.
An initialization step is performed before entering the EM algorithm. If you want to avoid this step, pass a suitable init_params keyword argument to the estimator’s constructor.
Parameters:
    sequences : list
    y : unused
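A fitting sketch, continuing the hypothetical model object constructed above; sequences is assumed to be a list of 2D arrays of shape (n_samples_i, n_features):

    # Each element of `sequences` is an (n_samples_i, n_features) array.
    model.fit(sequences)        # y is unused and can be omitted
    print(model.transmat_)      # fitted attributes listed above become available
    print(model.fit_logprob_)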
Get parameters for this estimator.
Parameters:
    deep : boolean, optional
Returns:
    params : mapping of string to any
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Returns:
    self
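A brief sketch of these scikit-learn-style accessors on the hypothetical model object:

    params = model.get_params(deep=True)   # mapping of parameter name -> value
    print(params['n_states'])

    # set_params returns self, so the call can be chained if desired.
    model.set_params(n_em_iter=200, thresh=1e-3)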
The implied relaxation timescales of the hidden Markov transition matrix
By diagonalizing the transition matrix, its propagation of an arbitrary initial probability vector can be written as a sum of the eigenvectors of the transition matrix, each weighted by a term that decays exponentially with time. Each of these eigenvectors describes a “dynamical mode” of the transition matrix and has a characteristic timescale, which gives the timescale on which that mode decays towards equilibrium. These timescales are given by \(-1/\log(u_i)\), where \(u_i\) are the eigenvalues of the transition matrix. In a reversible HMM with N states, the number of timescales is at most N-1. (The -1 comes from the fact that the stationary distribution of the chain is associated with an eigenvalue of 1, and hence an infinite characteristic timescale.) The number of timescales can fall below N-1 by one for every eigenvalue of the transition matrix that is negative (which is allowed under detailed balance), since negative eigenvalues do not correspond to a relaxation timescale.
Returns:
    timescales : array, shape=[n_timescales]
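The eigenvalue-to-timescale conversion described above can be sketched directly with NumPy; this illustrates the formula \(-1/\log(u_i)\) on a generic transition matrix and is not the estimator's internal code:

    import numpy as np

    def implied_timescales(transmat):
        """Relaxation timescales -1/log(u_i) of a transition matrix."""
        u = np.sort(np.real(np.linalg.eigvals(transmat)))[::-1]
        u = u[1:]        # drop the stationary eigenvalue (u_0 = 1)
        u = u[u > 0]     # negative eigenvalues carry no relaxation timescale
        return -1.0 / np.log(u)

    T = np.array([[0.90, 0.05, 0.05],
                  [0.10, 0.80, 0.10],
                  [0.05, 0.05, 0.90]])
    print(implied_timescales(T))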
Log-likelihood of sequences under the model.
Parameters:
    sequences : list
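For example, on held-out trajectories (hypothetical model and data), a higher score indicates a better fit:

    print(model.score(sequences))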
Find the most likely hidden-state sequence corresponding to each data time series.
Uses the Viterbi algorithm.
Parameters:
    sequences : list
Returns:
    viterbi_logprob : float
    hidden_sequences : list of np.ndarray[dtype=int, shape=n_samples_i]
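A decoding sketch with the hypothetical fitted model:

    viterbi_logprob, hidden_sequences = model.predict(sequences)
    # One integer array per input trajectory, giving its most likely state path.
    for states in hidden_sequences:
        print(states[:10])
    print(viterbi_logprob)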
Find conformations most representative of model means.
Parameters:
    sequences : list
Returns:
    centroid_pairs_by_state : np.ndarray, dtype=int, shape=(n_states, 1, 2)
    mean_approx : np.ndarray, dtype=float, shape=(n_states, 1, n_features)
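A usage sketch; the interpretation of the index pairs below is an assumption based on the documented return shapes:

    pairs, mean_approx = model.draw_centroids(sequences)
    traj_idx, frame_idx = pairs[0, 0]   # frame most representative of state 0's mean
    print(traj_idx, frame_idx)
    print(mean_approx[0, 0])            # feature vector of that frame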
Sample conformations from each state.
Parameters:
    sequences : list
    n_samples : int
    scheme : str, optional, default=’even’
    match_vars : bool, default=False
Returns:
    selected_pairs_by_state : np.ndarray, dtype=int, shape=(n_states, n_samples, 2)
    sample_features : np.ndarray, dtype=float, shape=(n_states, n_samples, n_features)
Notes
With scheme=’even’, this function assigns frames to states crisply and then samples from the uniform distribution over the frames belonging to each state. With scheme=’maxent’, it uses a maximum entropy method to determine a discrete distribution over samples whose mean (and, optionally, variance; see match_vars) matches the GHMM means.
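A sampling sketch with the hypothetical fitted model; the parameter names follow the list above and the index-pair interpretation mirrors draw_centroids:

    pairs, features = model.draw_samples(sequences, n_samples=10, scheme='even')
    # pairs[i, j] is assumed to be the (trajectory_index, frame_index) of the j-th
    # frame drawn from state i; features[i, j] is the corresponding feature vector.
    print(pairs.shape)      # (n_states, n_samples, 2)
    print(features.shape)   # (n_states, n_samples, n_features)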