Markov state models (MSMs)

Markov state models (MSMs) are a class of models that describe the long-timescale dynamics of molecular systems. They represent the dynamics as a series of memoryless, probabilistic jumps between a set of states. In practice, a model consists of (1) a set of conformational states and (2) a matrix of transition probabilities between each pair of states.

In MSMBuilder, you can use MarkovStateModel to build MSMs from “labeled” trajectories – that is, sequences of integers that are the result of clustering.


MarkovStateModel([lag_time, n_timescales, ...]) Reversible Markov State Model
BayesianMarkovStateModel([lag_time, ...]) Bayesian reversible Markov state model.

Maximum Likelihood and Bayesian Estimation

There are two steps in constructing an MSM:

  1. Count the number of observed transitions between states. That is, construct \(\mathbf{C}\) such that \(C_{ij}\) is the number of observed transitions from state \(i\) at time \(t\) to state \(j\) at time \(t+\tau\), summed over all times \(t\).

  2. Estimate the transition probability matrix, \(\mathbf{T}\)

    \[T_{ij} = P( s_{t+\tau} = j | s_t = i)\]

    where \(S = (s_t)\) is a trajectory in state-index space of length \(N\), and \(s_t \in \{1, \ldots, k\}\) is the state index of the trajectory at time \(t\).
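The two steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not MSMBuilder's implementation: the helper names `count_transitions` and `row_normalize` and the example trajectories are hypothetical, states are numbered from 0 rather than 1, and the estimate is a simple row normalization with no reversibility constraint.

```python
def count_transitions(sequences, n_states, lag_time=1):
    """Build C, where C[i][j] counts transitions i -> j separated by lag_time frames."""
    counts = [[0] * n_states for _ in range(n_states)]
    for seq in sequences:
        # Sliding window: every frame t that has a partner at t + lag_time contributes.
        for t in range(len(seq) - lag_time):
            counts[seq[t]][seq[t + lag_time]] += 1
    return counts


def row_normalize(counts):
    """Naive estimate T[i][j] = C[i][j] / sum_j C[i][j] (no reversibility constraint)."""
    transmat = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("a state has no observed outgoing transitions")
        transmat.append([c / total for c in row])
    return transmat


# Hypothetical labeled trajectories over three states, numbered from 0.
sequences = [[0, 0, 1, 1, 2, 2, 1, 0], [2, 2, 1, 0, 0, 1]]
counts = count_transitions(sequences, n_states=3, lag_time=1)
transmat = row_normalize(counts)
```

Counting each trajectory separately avoids recording a spurious transition from the last frame of one trajectory to the first frame of the next.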

The probability that a given transition probability matrix would generate some observed trajectory (the likelihood) is

\[\mathcal{L}(\mathbf{T}) = P(S | \mathbf{T}) = \prod_{t=0}^{N-\tau} T_{s_t, s_{t+\tau}} = \prod_{i,j}^{k} T_{ij}^{C_{ij}}.\]
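The equality between the product over time steps and the product over count-matrix entries can be checked numerically. The sketch below works with logarithms, since the raw product underflows quickly for long trajectories; the transition matrix, the trajectory, and the function names are hypothetical, and states are numbered from 0.

```python
import math


def log_likelihood_from_path(seq, transmat, lag_time=1):
    """Sum log T[s_t][s_{t+lag}] over every pair of frames separated by lag_time."""
    return sum(
        math.log(transmat[seq[t]][seq[t + lag_time]])
        for t in range(len(seq) - lag_time)
    )


def log_likelihood_from_counts(counts, transmat):
    """Sum C[i][j] * log T[i][j], skipping pairs that were never observed."""
    total = 0.0
    for i, row in enumerate(counts):
        for j, c in enumerate(row):
            if c > 0:
                total += c * math.log(transmat[i][j])
    return total


# Hypothetical two-state example.
transmat = [[0.9, 0.1], [0.2, 0.8]]
seq = [0, 0, 1, 1, 1, 0, 0]

# Count matrix for lag_time = 1, built directly from the trajectory.
counts = [[0, 0], [0, 0]]
for t in range(len(seq) - 1):
    counts[seq[t]][seq[t + 1]] += 1

ll_path = log_likelihood_from_path(seq, transmat)
ll_counts = log_likelihood_from_counts(counts, transmat)
```

Both values agree up to floating-point rounding, which is why the count matrix is a sufficient summary of the trajectory for this likelihood.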

Assuming a prior distribution on \(\mathbf{T}\) of the form \(P(\mathbf{T}) \propto \prod_{i,j} T_{ij}^{B_{ij}}\), we then have a posterior distribution

\[P(\mathbf{T} | S) \propto \prod_{i,j}^{k} T_{ij}^{B_{ij} + C_{ij}}.\]
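As a worked special case, suppose the only constraint on \(\mathbf{T}\) is that each row sums to one. This is an assumption made for illustration: the estimators listed above are reversible, and the additional detailed-balance constraint changes the solution. Maximizing \(\log \mathcal{L}(\mathbf{T}) = \sum_{i,j} C_{ij} \log T_{ij}\) with one Lagrange multiplier per row gives

\[\hat{T}_{ij} = \frac{C_{ij}}{\sum_{l=1}^{k} C_{il}},\]

and, in the same setting, the posterior factorizes over rows, each of which follows a Dirichlet distribution:

\[(T_{i1}, \ldots, T_{ik}) \,|\, S \sim \mathrm{Dirichlet}(B_{i1} + C_{i1} + 1, \ldots, B_{ik} + C_{ik} + 1).\]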

MSMBuilder implements two MSM estimators.

  • MarkovStateModel performs maximum likelihood estimation. It estimates a single transition matrix, \(\mathbf{T}\), to maximize \(\mathcal{L}(\mathbf{T})\).
  • BayesianMarkovStateModel uses Metropolis Markov chain Monte Carlo to (approximately) draw a sample of transition matrices from the posterior distribution \(P(\mathbf{T} | S)\). This sampler is described in Metzner et al. [5]. The resulting sample can be used to estimate the sampling uncertainty in functions of the transition matrix (e.g. relaxation timescales).
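To illustrate how a posterior sample turns into an uncertainty estimate, the sketch below uses a deliberately simpler stand-in for the Metropolis sampler: it ignores the reversibility constraint, in which case each row of the posterior is a Dirichlet distribution with parameters \(B_{ij} + C_{ij} + 1\) and can be sampled directly. It is not the algorithm used by BayesianMarkovStateModel; the counts, the zero prior, and the function names are hypothetical.

```python
import random
import statistics


def sample_row(alphas, rng):
    """Draw one probability vector from Dirichlet(alphas) via normalized gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def sample_transmat(counts, prior, rng):
    """Draw one row-stochastic matrix; row i is Dirichlet(B[i][j] + C[i][j] + 1)."""
    return [
        sample_row([b + c + 1 for b, c in zip(prior_row, count_row)], rng)
        for prior_row, count_row in zip(prior, counts)
    ]


rng = random.Random(0)           # fixed seed so the sketch is reproducible
counts = [[90, 10], [20, 80]]    # hypothetical two-state count matrix C
prior = [[0, 0], [0, 0]]         # B = 0 corresponds to a uniform prior

samples = [sample_transmat(counts, prior, rng) for _ in range(2000)]

# For a 2x2 row-stochastic matrix the eigenvalues are 1 and trace - 1.
lambda2 = [t[0][0] + t[1][1] - 1.0 for t in samples]
mean_lambda2 = statistics.mean(lambda2)
std_lambda2 = statistics.stdev(lambda2)
```

The spread `std_lambda2` is the kind of sampling uncertainty the text refers to: it shrinks as the counts grow and says nothing about errors introduced by the choice of states.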


The uncertainty in the transition matrix (and in functions of the transition matrix) that can be estimated from BayesianMarkovStateModel does not account for all sources of error. In particular, the discretization induced by clustering produces a negative bias on the eigenvalues of the transition matrix: even in the limit of infinite sampling, they underestimate the eigenvalues of the propagator / transfer operator [6]. See section 3D (Quantifying the discretization error) of Prinz et al. [1] for more discussion of the discretization error.
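The effect of this bias on relaxation timescales can be made explicit with the standard relation between an eigenvalue \(\lambda_i\) of the transition matrix at lag time \(\tau\) and the corresponding timescale \(t_i\) (a relation not stated above, included here as background):

\[t_i = -\frac{\tau}{\ln \lambda_i}, \qquad 0 < \lambda_i < 1.\]

Because \(t_i\) increases monotonically with \(\lambda_i\) on this interval, an eigenvalue that is biased low yields a timescale that is biased low as well.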

Tradeoffs and Parameter Selection

The most important tradeoff with MSMs is a bias-variance dilemma in the choice of the number of states. We know analytically that the expected value of the relaxation timescales is below the true value when using a finite number of states, and that the magnitude of this bias decreases as the number of states goes up. On the other hand, for a fixed data set the statistical error in the MSM (variance) goes up as the number of states increases, because there are fewer transitions (data) per element of the transition probability matrix.
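The variance side of this tradeoff follows from simple arithmetic: a model with \(k\) states has \(k^2\) transition matrix elements, so the average number of observed transitions per element falls quadratically with \(k\). The sketch below uses a hypothetical data set size and function name purely to show the scaling.

```python
def mean_counts_per_element(n_transitions, n_states):
    """Average number of observed transitions per entry of a k x k matrix."""
    return n_transitions / (n_states * n_states)


n_transitions = 100_000  # hypothetical total number of observed transitions
table = {k: mean_counts_per_element(n_transitions, k) for k in (10, 100, 1000)}

for k, mean_counts in table.items():
    print(f"{k:5d} states: {mean_counts:g} transitions per matrix element on average")
```

In practice transitions concentrate on a small fraction of state pairs, so the average understates how well the frequently visited elements are determined, but the trend with the number of states is the same.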

There are no existing algorithms in the MSM literature which fully balance these competing sources of error in an automatic and practical way, although some partially satisfactory algorithms are available. [3] [4]


[1] Prinz, J.-H., et al. Markov models of molecular kinetics: Generation and validation. J. Chem. Phys. 134.17 (2011): 174105.
[2] Pande, V. S., K. A. Beauchamp, and G. R. Bowman. Everything you wanted to know about Markov State Models but were afraid to ask. Methods 52.1 (2010): 99-105.
[3] McGibbon, R. T., C. R. Schwantes, and V. S. Pande. Statistical Model Selection for Markov Models of Biomolecular Dynamics. J. Phys. Chem. B (2014).
[4] Kellogg, E. H., O. F. Lange, and D. Baker. Evaluation and optimization of discrete state models of protein folding. J. Phys. Chem. B 116.37 (2012): 11405-11413.
[5] Metzner, P., F. Noe, and C. Schutte. Estimating the sampling error: Distribution of transition matrices and functions of transition matrices for given trajectory data. Phys. Rev. E 80.2 (2009): 021106.
[6] Nuske, F., et al. Variational approach to molecular kinetics. J. Chem. Theory Comput. 10.4 (2014): 1739-1752.