Continuous-time Markov State Model

Algorithms

ContinuousTimeMSM([lag_time, n_timescales, ...]) Reversible first order master equation model

Theory

Consider an n-state time-homogeneous Markov process, \(X(t)\) At time \(t\), the \(n\)-vector \(P(t) = Pr[ X(t) = i ]\) is probability that the system is in each of the \(n\) states. These probabilities evolve forward in time, governed by an \(n \times n\) transition rate matrix \(K\)

\[dP(t)/dt = P(t) \cdot K\]

The solution is

\[P(t) = \exp(tK) \cdot P(0)\]

Where \(\exp(tK)\) is the matrix exponential. Written differently, the state-to-state lag-\(\tau\) transition probabilities are

\[Pr[ X(t+\tau) = j \;|\; X(t) = i ] = exp(\tau K)_{ij}\]

For this model, we observe the evolution of one or more chains, \(X(t)\) at a regular interval, \(\tau\). Let \(C_{ij}\) be the number of times the chain was observed at state \(i\) at time \(t\) and at state \(j\) at time \(t+\tau\) (the number of observed transition counts). Suppose that \(K\) depends on a length-\(b\) parameter vector, \(\theta\). The log-likelihood is

\[\mathcal{L}(\theta) = \sum_{ij} \left[ C_{ij} \log\left(\left[\exp(\tau K(\theta))\right]_{ij}\right)\right]\]

Dense Parameterization

This code parameterizes \(K(\theta)\) such that \(K\) is constrained to satisfy detailed balance with respect to a stationary distribution, \(\pi\). For an \(n\)-state model, \(\theta\) is of length \(n(n-1)/2 + n\). The first \(n(n-1)/2\) elements of \(\theta\) are the parameterize the symmetric rate matrix, \(S\), and the remaining \(n\) entries parameterize the stationary distribution. For \(n=4\), the function \(K(\theta)\) is

\[\begin{split}\begin{array}a s_u = \exp(\theta_u) &\text{for} &u = 0, 1, \ldots, n(n-1)/2 \\ \pi_i = \exp(\theta_{i + n(n-1)/2}) &\text{for} &i = 0, 1, ..., n \\ r_i = \sqrt{\pi_i} &\text{for} &i = 0, 1, ..., n \end{array}\end{split}\]
\[\begin{split}K(\theta) = \begin{bmatrix} k_{00} & s_0(r_1/r_0) & s_1(r_2/r_0) & s_2(r_3/r_0) \\ s_0(r_0/r_1) & k_{11} & s_3(r_2/r_1) & s_4(r_3/r_1) \\ s_1(r_0/r_2) & s_3(r_1/r_2) & k_{22} & s_5(r_3/r_2) \\ s_2(r_0/r_3) & s_4(r_1/r_3) & s_5(r_2/r_3) & k_{33} \end{bmatrix}\end{split}\]

where the diagonal elements \(k_{ii}\) are set such that the row sums of \(K\) are all zero, \(k_{ii} = -\sum_{j != i} K_{ij}\). This form for \(K\) satisfies detailed balance by construction

\[K_{ij} / K_{ji} = r_j^2 / r_i^2 = \pi_j / \pi_i\]
\[\pi_i K_{ij} = pi_j K_{ji}\]

Note that \(K\) is built from \(\exp(\theta)\). This parameterization makes optimization easier, since it keeps the off-diagonal entries positive and the diagonal entries negative. The optimization with respect to \(\theta\) without constraints.

Sparse Parameterization

Using the dense parameterization, it is not possible for elements of \(K\) to be exactly zero, because the symmetric rate matrix is parameterized in log-space. Thus, ContinuousTimeMSM also includes an ability to find a sparse rate matrix, through an alternative parameterization. In the sparse parameterization, certain entries in \(\theta\) are fixed at \(-\infty\), such that \(s_u = 0\). The indices of the “active” elements in \(\theta\) are stored in an array of indices, inds, and a compressed representation of theta is used, with only the “active” elements explicitly.

Estimation

ContinuousTimeMSM uses L-BFGS-B to find a maximum likelihood estimate (MLE) rate matrix, \(\hat{\theta}\) and \(K(\hat{\theta})\). By default, the algorithm works first be estimating a fully dense rate matrix. Then, small off-diagonal elements of K are taken as candidates for truncation to zero. A new optimization using the sparse parameterization is performed with these elements constrained. If the log-likelihood of the sparse model is superior to the log likelihood of the dense model, it is retained.

Under the hood (expert functions)

build_ratemat(theta, n, inds, out[, which]) Build the reversible rate matrix K or symmetric rate matrix, S,
dK_dtheta_A(theta, n, u, A[, out]) Compute the sum of the Hadamard (element-wise) product of the derivative of (the rate matrix, K, with respect to the free parameters,`theta`, dK_ij / dtheta_u) and another matrix, A.
loglikelihood(theta, counts, n[, t]) Log likelihood and gradient of the log likelihood of a continuous-time Markov model.
hessian(theta, counts[, t]) Estimate of the hessian of the log-likelihood with respect to theta.
sigma_K(covar_theta, theta, n) Estimate the asymptotic standard deviation (uncertainty in the rate
sigma_pi(covar_theta, theta, n) Estimate the asymptotic standard deviation (uncertainty) in the stationary distribution, pi.
sigma_eigenvalues(covar_theta, theta, n) Estimate the asymptotic standard deviation (uncertainty) in the
sigma_timescales sigma_timescales(covar_theta, theta, n):
eig_K(A, n[, pi, which]) Diagonalize the rate matrix, K, from either the matrix K or the symmetric rate matrix, S.

References

[1]Kalbfleisch, J. D., and Jerald F. Lawless. “The analysis of panel data under a Markov assumption.” J. Am. Stat. Assoc. 80.392 (1985): 863-871.
Versions