msmbuilder.reduce.tICA.tICA

class msmbuilder.reduce.tICA.tICA(lag, calc_cov_mat=True, prep_metric=None, size=None)

tICA is a class for calculating the matrices required for time-structure based independent component analysis (tICA). It can calculate both the time-lag correlation matrix and the covariance matrix. Its advantage is that the matrices for a large dataset can be accumulated by “training” on smaller pieces of the dataset, one at a time.

Notes

It can be shown that the time-lag correlation matrix is the same as:

C = E[Outer(X[t], X[t+lag])] - Outer(E[X[t]], E[X[t+lag]])

Because of this it is possible to calculate running sums corresponding to variables A, B, D:

A = E[X[t]]
B = E[X[t+lag]]
D = E[Outer(X[t], X[t+lag])]

Then at the end we can calculate C:

C = D - Outer(A, B)

Finally we can get a symmetrized C’ from our estimate of C, for example by averaging it with its transpose:

C’ = (C + C^T) / 2
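The running-sum scheme above can be sketched with NumPy as follows. This is only an illustration under stated assumptions, not the msmbuilder implementation: the function name running_timelag_correlation and its arguments are hypothetical, each chunk is assumed to be a 2-D array of shape (n_frames, size), and the expectations are taken over all (X[t], X[t+lag]) pairs pooled from every chunk.

```python
import numpy as np


def running_timelag_correlation(chunks, lag):
    """Hypothetical helper: accumulate the sums behind A, B and D chunk by chunk.

    chunks : iterable of arrays, each of shape (n_frames, size)
    lag    : non-negative int, the lag time in frames
    Returns the symmetrized matrix C' = (C + C^T) / 2.
    """
    sum_a = sum_b = sum_d = None
    n_pairs = 0
    for X in chunks:
        X = np.asarray(X, dtype=float)
        n_frames, size = X.shape
        if n_frames <= lag:
            continue  # this chunk is too short to contribute any pair
        head = X[:n_frames - lag]  # rows X[t]
        tail = X[lag:]             # rows X[t + lag]
        if sum_a is None:
            # Size the containers from the first usable chunk.
            sum_a = np.zeros(size)
            sum_b = np.zeros(size)
            sum_d = np.zeros((size, size))
        sum_a += head.sum(axis=0)
        sum_b += tail.sum(axis=0)
        sum_d += head.T @ tail     # sum over t of Outer(X[t], X[t + lag])
        n_pairs += head.shape[0]
    if n_pairs == 0:
        raise ValueError("no chunk is longer than the lag")
    A = sum_a / n_pairs
    B = sum_b / n_pairs
    D = sum_d / n_pairs
    C = D - np.outer(A, B)
    return (C + C.T) / 2.0
```

Because only the three sums and the pair count are kept between chunks, memory use depends on size and not on the total number of frames.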

There is, in fact, a maximum likelihood estimator (MLE) for each of the matrices C and S, where:

S = E[Outer(X[t], X[t])]

The MLE estimators are:

mu = 1 / (2(N - lag)) * sum_{t=1}^{N - lag} (X[t] + X[t + lag])

C = 1 / (2(N - lag)) * sum_{t=1}^{N - lag} (Outer(X[t] - mu, X[t + lag] - mu) + Outer(X[t + lag] - mu, X[t] - mu))

S = 1 / (2(N - lag)) * sum_{t=1}^{N - lag} (Outer(X[t] - mu, X[t] - mu) + Outer(X[t + lag] - mu, X[t + lag] - mu))
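For a single trajectory held in memory, these three estimators can be written directly in NumPy. The sketch below is an illustration only: the function name mle_estimates is hypothetical, X is assumed to be a 2-D array of shape (N, size), and nothing here is taken from the msmbuilder source.

```python
import numpy as np


def mle_estimates(X, lag):
    """Hypothetical helper: evaluate mu, C and S for one trajectory.

    X   : array of shape (N, size)
    lag : non-negative int smaller than N
    """
    X = np.asarray(X, dtype=float)
    n_pairs = X.shape[0] - lag      # N - lag
    if lag < 0 or n_pairs < 1:
        raise ValueError("lag must satisfy 0 <= lag < N")
    head = X[:n_pairs]              # rows X[t]
    tail = X[lag:]                  # rows X[t + lag]
    norm = 2.0 * n_pairs            # 2(N - lag)

    # A single mean shared by both ends of every pair.
    mu = (head + tail).sum(axis=0) / norm
    dh = head - mu
    dt = tail - mu

    # dh.T @ dt is the sum over t of Outer(X[t] - mu, X[t + lag] - mu).
    C = (dh.T @ dt + dt.T @ dh) / norm
    S = (dh.T @ dh + dt.T @ dt) / norm
    return mu, C, S
```

Both matrices come out symmetric by construction, so no separate symmetrization step is needed for C.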

__init__(lag, calc_cov_mat=True, prep_metric=None, size=None)

Create an empty tICA object.

To add data to the object, use the train method.

Parameters:

lag: int :

The lag to use in calculating the time-lag correlation matrix. If zero, then only the covariance matrix is calculated.

calc_cov_mat: bool, optional :

If lag > 0, the covariance matrix is calculated as well.

prep_metric: msmbuilder.metrics.Vectorized subclass instance, optional :

Metric to use to prepare trajectories. If not specified, you must pass prepared trajectories to the train method via the “prep_trajectory” keyword argument.

size: int, optional :

The number of coordinates in the vector representation of the protein. If None, the first trained vector is used to initialize it.

Notes

To load an already constructed tICA object, use tICA.load().

Methods

__init__(lag[, calc_cov_mat, prep_metric, size]) Create an empty tICA object.
get_current_estimate Calculate the current estimate of the time-lag correlation matrix and the covariance matrix (if asked for).
initialize(size) initialize the containers for the calculation
load(cls, tica_fn) load a tICA solution to use in projecting data.
project([trajectory, prep_trajectory, which]) project a trajectory (or prepared trajectory) onto a subset of
save(output) save the results to file
solve([pca_cutoff]) Solve the eigenvalue problem.
train([trajectory, prep_trajectory]) add a trajectory to the calculation