Time-structure Independent Component Analysis (tICA)
Linear dimensionality reduction using an eigendecomposition of the time-lagged correlation matrix and the covariance matrix of the data. Only the eigenvectors that decorrelate slowest are kept, and the data are projected onto them to obtain a lower-dimensional space.
Parameters: | n_components : int, None
|
---|
Notes
This method was introduced originally in [R20], and has been applied to the analysis of molecular dynamics data in [R17], [R18], and [R19]. In [R17] and [R18], tICA was used as a dimensionality reduction technique before fitting other kinetic models.
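The decomposition described at the top of this page can be sketched with NumPy alone. This is a minimal illustration, not the library's implementation: the function name `tica_sketch`, the single-trajectory input, the way the two matrices are estimated, and the Cholesky-based solver are all assumptions made for the example.

```python
import numpy as np

def tica_sketch(X, lag_time=1, n_components=2):
    """Toy tICA on one trajectory X of shape (n_samples, n_features).

    Hypothetical helper for illustration only.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean
    head, tail = Xc[:-lag_time], Xc[lag_time:]
    n_pairs = head.shape[0]

    # Symmetrized time-lagged correlation matrix.
    C = (head.T @ tail + tail.T @ head) / (2.0 * n_pairs)
    # Covariance matrix, estimated from the same pairs of frames (assumption).
    S = (head.T @ head + tail.T @ tail) / (2.0 * n_pairs)

    # Solve C v = lambda S v by whitening with the Cholesky factor of S.
    L = np.linalg.cholesky(S)
    L_inv = np.linalg.inv(L)
    M = L_inv @ C @ L_inv.T
    M = (M + M.T) / 2.0  # remove round-off asymmetry
    evals, W = np.linalg.eigh(M)

    # eigh returns ascending order; the slowest modes have the largest eigenvalues.
    order = np.argsort(evals)[::-1]
    evals = evals[order]
    V = L_inv.T @ W[:, order]  # columns are generalized eigenvectors

    components = V[:, :n_components].T  # shape (n_components, n_features)
    return mean, components, evals

# Usage on synthetic data: one slow sine wave mixed with white noise.
rng = np.random.default_rng(0)
t = np.arange(5000)
slow = np.sin(2 * np.pi * t / 500)
fast = rng.normal(size=t.size)
X = np.column_stack([slow + 0.5 * fast, fast - 0.3 * slow])
mean, components, eigenvalues = tica_sketch(X, lag_time=1, n_components=1)
projected = (X - mean) @ components.T  # shape (5000, 1)
```

In this toy setup the first component should track the slow sine wave, since that is the direction whose autocorrelation decays most slowly.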
Attributes
components_ | array-like, shape (n_components, n_features). Components with maximum autocorrelation. |
offset_correlation_ | array-like, shape (n_features, n_features). Symmetric time-lagged correlation matrix, \(C=E[(x_t)^T x_{t+lag}]\). |
eigenvalues_ | array-like, shape (n_features,). Eigenvalues of the tICA generalized eigenproblem, in decreasing order. |
eigenvectors_ | array-like, shape (n_components, n_features). Eigenvectors of the tICA generalized eigenproblem. The vectors give a set of “directions” through configuration space along which the system relaxes towards equilibrium. Each eigenvector is associated with a characteristic timescale \(-\frac{lag\_time}{\ln \lambda_i}\), where \(\lambda_i\) is the corresponding eigenvalue. See [2] for more information. |
means_ | array, shape (n_features,). The mean of the data along each feature. |
n_observations_ | int. Total number of data points fit by the model. Note that the model is “reset” by calling fit() with new sequences, whereas partial_fit() updates the fit with new data, and is suitable for online learning. |
n_sequences_ | int. Total number of sequences fit by the model. Note that the model is “reset” by calling fit() with new sequences, whereas partial_fit() updates the fit with new data, and is suitable for online learning. |
timescales_ | array-like, shape (n_features,). The implied timescales of the tICA model, given by -offset / log(eigenvalues). |
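The timescale relation in the attribute descriptions above can be worked through with a short sketch. The helper name `implied_timescales` is hypothetical, and returning NaN for eigenvalues outside the open interval (0, 1) is a choice made for this example, not documented library behavior.

```python
import math

def implied_timescales(eigenvalues, lag_time):
    """Return -lag_time / ln(lambda_i) for each eigenvalue.

    Hypothetical helper: eigenvalues outside (0, 1) have no real,
    positive timescale, so they are mapped to NaN here (assumption).
    """
    timescales = []
    for lam in eigenvalues:
        if 0.0 < lam < 1.0:
            timescales.append(-lag_time / math.log(lam))
        else:
            timescales.append(float("nan"))
    return timescales

# An eigenvalue close to 1 corresponds to a slow process (long timescale);
# the result is in the same time units as lag_time.
slow_and_fast = implied_timescales([0.99, 0.5], lag_time=10)
```

Because the eigenvalues are sorted in decreasing order, the resulting timescales are sorted from slowest to fastest.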
Methods
fit(sequences[, y]) | Fit the model with a collection of sequences. |
fit_transform(sequences[, y]) | Fit the model with the sequences and apply the dimensionality reduction to them. |
get_params([deep]) | Get parameters for this estimator. |
partial_fit(X) | Fit the model with X. |
partial_transform(features) | Apply the dimensionality reduction to a single array of features. |
score(sequences[, y]) | Score the model on new data using the generalized matrix Rayleigh quotient |
set_params(**params) | Set the parameters of this estimator. |
summarize() | Some summary information. |
transform(sequences) | Apply the dimensionality reduction to each sequence. |
Training score of the model, computed as the generalized matrix Rayleigh quotient, the sum of the first n_components eigenvalues.
Fit the model with a collection of sequences.
This method is not online. Any state accumulated from previous calls to fit() or partial_fit() will be cleared. For online learning, use partial_fit().
Parameters: | sequences: list of array-like, each of shape (n_samples_i, n_features)
y : None
|
---|---|
Returns: | self : object
|
Fit the model with X.
This method is suitable for online learning. The state of the model will be updated with the new data X.
Parameters: | X: array-like, shape (n_samples, n_features)
|
---|---|
Returns: | self : object
|
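The difference between fit(), which clears accumulated state, and partial_fit(), which updates it, can be illustrated with a toy accumulator. The class `RunningMean` below is hypothetical and only tracks a running mean and the two counters; it is a sketch of the reset-versus-update pattern, not the library's internals.

```python
import numpy as np

class RunningMean:
    """Hypothetical accumulator showing fit() vs partial_fit() semantics."""

    def __init__(self):
        self._reset()

    def _reset(self):
        # Clear every piece of accumulated state.
        self.n_observations_ = 0
        self.n_sequences_ = 0
        self._sum = None

    def partial_fit(self, X):
        """Update the running statistics with one array of shape (n_samples, n_features)."""
        X = np.asarray(X, dtype=float)
        if self._sum is None:
            self._sum = np.zeros(X.shape[1])
        self._sum += X.sum(axis=0)
        self.n_observations_ += X.shape[0]
        self.n_sequences_ += 1
        return self

    def fit(self, sequences):
        """Discard previous state, then accumulate every sequence."""
        self._reset()
        for X in sequences:
            self.partial_fit(X)
        return self

    @property
    def means_(self):
        return self._sum / self.n_observations_

model = RunningMean()
model.partial_fit(np.ones((3, 2)))   # state now covers 3 observations
model.partial_fit(np.zeros((1, 2)))  # state now covers 4 observations
```

Calling `model.fit(...)` afterwards would discard those four observations and start again from the new sequences.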
Apply the dimensionality reduction to each sequence in sequences.
Parameters: | sequences: list of array-like, each of shape (n_samples_i, n_features)
|
---|---|
Returns: | sequence_new : list of array-like, each of shape (n_samples_i, n_components) |
Apply the dimensionality reduction to a single array of features.
Parameters: | features: array-like, shape (n_samples, n_features)
|
---|---|
Returns: | sequence_new : array-like, shape (n_samples, n_components)
|
Notes
This function acts on a single featurized trajectory.
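The relationship between the list-based and single-trajectory transforms can be sketched as follows. The free functions below are hypothetical stand-ins for the methods, and the projection rule (subtract the feature means, then project onto the components) is an assumption about how the reduction is applied.

```python
import numpy as np

def partial_transform(features, means, components):
    """Project one trajectory of shape (n_samples, n_features).

    Assumes the reduction is (features - means) @ components.T.
    """
    features = np.asarray(features, dtype=float)
    return (features - means) @ components.T

def transform(sequences, means, components):
    """Apply partial_transform to every trajectory in a list."""
    return [partial_transform(X, means, components) for X in sequences]

# Example with 3 features reduced to 2 components.
means = np.array([1.0, 2.0, 3.0])
components = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]])  # shape (n_components, n_features)
sequences = [np.tile(means, (4, 1)), np.zeros((2, 3))]
reduced = transform(sequences, means, components)
```

Each output array keeps the number of samples of its input sequence and has n_components columns.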
Fit the model with the sequences and apply the dimensionality reduction to them.
This method is not online. Any state accumulated from previous calls to fit() or partial_fit() will be cleared. For online learning, use partial_fit().
Parameters: | sequences: list of array-like, each of shape (n_samples_i, n_features)
y : None
|
---|---|
Returns: | sequence_new : list of array-like, each of shape (n_samples_i, n_components) |
Score the model on new data using the generalized matrix Rayleigh quotient
Parameters: | sequences : list of array-like
|
---|---|
Returns: | gmrq : float
|
References
[R21] | McGibbon, R. T. and V. S. Pande, “Variational cross-validation of slow dynamical modes in molecular kinetics” http://arxiv.org/abs/1407.8083 (2014) |
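As a rough illustration of the score, the generalized matrix Rayleigh quotient can be written as the trace of \((V^T S V)^{-1} V^T C V\), where the columns of \(V\) are the retained eigenvectors and \(C\) and \(S\) are the time-lagged correlation and covariance matrices of the data being scored. The function `gmrq` and its argument layout are assumptions for this sketch, not the library's code.

```python
import numpy as np

def gmrq(V, C, S):
    """Generalized matrix Rayleigh quotient (hypothetical helper).

    V : (n_features, k) array whose columns are the retained eigenvectors
    C : (n_features, n_features) time-lagged correlation matrix
    S : (n_features, n_features) covariance matrix
    """
    numerator = V.T @ C @ V
    denominator = V.T @ S @ V
    # trace(denominator^{-1} @ numerator), without forming the inverse.
    return float(np.trace(np.linalg.solve(denominator, numerator)))

# Toy case: with S = I and diagonal C, the eigenvectors are unit vectors and
# the score of the top two is the sum of the two largest eigenvalues.
S = np.eye(3)
C = np.diag([0.9, 0.5, 0.1])
V = np.eye(3)[:, :2]
score = gmrq(V, C, S)
```

On the training data this reduces to the sum of the first n_components eigenvalues, as stated for the training score; on new data, `C` and `S` would be estimated from the new sequences while `V` stays fixed.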
Some summary information.
Get parameters for this estimator.
Parameters: | deep: boolean, optional
|
---|---|
Returns: | params : mapping of string to any
|
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Returns: | self |
---|
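The `<component>__<parameter>` naming scheme can be illustrated with a small sketch that separates top-level parameters from nested ones. The function `split_nested_params` and the step name `tica` are hypothetical; this only shows how such names can be parsed, not how set_params() is implemented.

```python
def split_nested_params(params):
    """Split a flat parameter dict into top-level and nested parts.

    Hypothetical helper: "tica__lag_time" is read as parameter "lag_time"
    of the component named "tica".
    """
    top_level = {}
    nested = {}
    for key, value in params.items():
        name, delimiter, sub_key = key.partition("__")
        if delimiter:
            nested.setdefault(name, {})[sub_key] = value
        else:
            top_level[name] = value
    return top_level, nested

top_level, nested = split_nested_params(
    {"n_components": 2, "tica__lag_time": 5}
)
```

Here `n_components` would apply to the outer object, while `lag_time` would be forwarded to the component registered under the name `tica`.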