msmbuilder.featurizer.ContactFeaturizer

class msmbuilder.featurizer.ContactFeaturizer(contacts='all', scheme='closestheavy', ignore_nonprotein=True)
Featurizer based on residue-residue distances.
This featurizer transforms a dataset containing MD trajectories into a vector dataset by representing each frame in each of the MD trajectories by a vector of the distances between pairs of amino-acid residues.
The exact method for computing the distance between two residues is configurable with the scheme parameter.
Parameters:
contacts : np.ndarray or ‘all’
Array containing (0-indexed) indices of the residues to compute the contacts for. For example, np.array([[0, 10], [0, 11]]) would compute the contact between residue 0 and residue 10 as well as the contact between residue 0 and residue 11. NOTE: if no array is passed, then ‘all’ contacts are calculated. This means that the result will contain all contacts between residues separated by at least 3 residues.
scheme : {‘ca’, ‘closest’, ‘closestheavy’}
Scheme to determine the distance between two residues:
 ‘ca’ : distance between two residues is given by the distance between their alpha carbons
 ‘closest’ : distance is the closest distance between any two atoms in the residues
 ‘closestheavy’ : distance is the closest distance between any two non-hydrogen atoms in the residues
ignore_nonprotein : bool
When using contacts=‘all’, don’t compute contacts between “residues” which are not protein (i.e. do not contain an alpha carbon).
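As a rough illustration of the contacts and scheme parameters, the following pure-Python sketch enumerates the pairs implied by ‘all’ and evaluates the three distance schemes on a toy pair of residues. It is not msmbuilder's implementation: the helper names (all_contact_pairs, residue_distance), the (element, atom_name, xyz) atom tuples, and the reading of "separated by at least 3 residues" as an index difference of at least 3 are all assumptions made for the example.

```python
import itertools
import math


def all_contact_pairs(n_residues, min_separation=3):
    """Hypothetical helper: residue pairs (i, j) with j - i >= min_separation."""
    return [(i, j)
            for i, j in itertools.combinations(range(n_residues), 2)
            if j - i >= min_separation]


# Which atoms each scheme considers. Atoms are (element, atom_name, xyz) tuples,
# a toy layout assumed for this sketch only.
_ATOM_FILTERS = {
    "ca": lambda atom: atom[1] == "CA",           # alpha carbons only
    "closest": lambda atom: True,                 # every atom
    "closestheavy": lambda atom: atom[0] != "H",  # non-hydrogen atoms
}


def residue_distance(res_a, res_b, scheme="closestheavy"):
    """Hypothetical helper: smallest distance between the atoms kept by `scheme`."""
    keep = _ATOM_FILTERS[scheme]
    atoms_a = [atom for atom in res_a if keep(atom)]
    atoms_b = [atom for atom in res_b if keep(atom)]
    return min(math.dist(a[2], b[2]) for a in atoms_a for b in atoms_b)


# Toy residues with made-up coordinates in arbitrary units.
res_0 = [("C", "CA", (0.0, 0.0, 0.0)), ("H", "HA", (1.0, 0.0, 0.0))]
res_3 = [("C", "CA", (5.0, 0.0, 0.0)), ("O", "O", (4.0, 0.0, 0.0)),
         ("H", "H", (2.0, 0.0, 0.0))]

pairs = all_contact_pairs(6)
distances = {scheme: residue_distance(res_0, res_3, scheme)
             for scheme in ("ca", "closest", "closestheavy")}
```

On this toy input the three schemes give different values for the same residue pair, which is why the choice of scheme changes the resulting feature vectors.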
Methods
describe_features(traj) : Return a list of dictionaries describing the contacts features.
featurize(traj)
fit(traj_list[, y])
fit_transform(X[, y]) : Fit to data, then transform it.
get_params([deep]) : Get parameters for this estimator.
partial_transform(traj) : Featurize an MD trajectory into a vector space derived from residue-residue distances.
set_params(**params) : Set the parameters of this estimator.
summarize() : Return some diagnostic summary statistics about this Markov model.
transform(traj_list[, y]) : Featurize several trajectories.
__init__(contacts='all', scheme='closestheavy', ignore_nonprotein=True)
describe_features(traj)
Return a list of dictionaries describing the contacts features.
Parameters:
traj : mdtraj.Trajectory
The trajectory to describe.
Returns:
feature_descs : list of dict
Dictionary describing each feature with the following information about the atoms participating in each contact:
 resnames: unique names of residues
 atominds: the atom indices
 resseqs: unique residue sequence ids (not necessarily 0-indexed)
 resids: unique residue ids (0-indexed)
 featurizer: Contact
 featuregroup: ca, heavy etc.
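To make the shape of the returned list concrete, here is a hedged sketch that builds one dictionary per residue pair using the keys listed above. The helper describe_contacts, the (resname, resseq, atom_indices) residue tuples, and the exact value layout under each key (in particular what atominds and featuregroup hold) are assumptions for illustration and may differ from what msmbuilder actually returns.

```python
def describe_contacts(residues, pairs, scheme="ca"):
    """Hypothetical helper: one description dict per residue pair.

    `residues` is a toy list of (resname, resseq, atom_indices) tuples.
    """
    descriptions = []
    for i, j in pairs:
        descriptions.append({
            "resnames": [residues[i][0], residues[j][0]],
            "atominds": [residues[i][2], residues[j][2]],  # layout assumed
            "resseqs": [residues[i][1], residues[j][1]],
            "resids": [i, j],
            "featurizer": "Contact",
            "featuregroup": scheme,  # value format assumed
        })
    return descriptions


# Made-up residues: resseq starts at 10 to show it need not be 0-indexed.
residues = [("ALA", 10, [0, 1]), ("GLY", 11, [2, 3]), ("SER", 12, [4, 5]),
            ("LEU", 13, [6, 7]), ("VAL", 14, [8, 9])]
feature_descs = describe_contacts(residues, [(0, 3), (0, 4)])
```

Under this reading, feature_descs[k] describes column k of the featurized array, so the list can be used to label features after the fact.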

fit_transform(X, y=None, **fit_params)
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
Parameters:
X : numpy array of shape [n_samples, n_features]
Training set.
y : numpy array of shape [n_samples]
Target values.
Returns:
X_new : numpy array of shape [n_samples, n_features_new]
Transformed array.
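The "fit, then transform" behaviour can be sketched generically as follows. ToyTransformer is a hypothetical class written only for this example (it centers columns on their mean) and is not part of msmbuilder; it merely shows how fit_transform chains the two steps on the same data.

```python
class ToyTransformer:
    """Hypothetical transformer used only to illustrate fit_transform."""

    def fit(self, X, y=None, **fit_params):
        # Learn the per-column mean of the training set.
        n_rows = len(X)
        self.means_ = [sum(column) / n_rows for column in zip(*X)]
        return self

    def transform(self, X):
        # Subtract the learned means from every row.
        return [[value - mean for value, mean in zip(row, self.means_)]
                for row in X]

    def fit_transform(self, X, y=None, **fit_params):
        # Fit to the data, then transform that same data.
        return self.fit(X, y, **fit_params).transform(X)


X = [[1.0, 10.0], [3.0, 30.0]]
X_new = ToyTransformer().fit_transform(X)
```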

get_params(deep=True)
Get parameters for this estimator.
Parameters:
deep : boolean, optional
If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:
params : mapping of string to any
Parameter names mapped to their values.

partial_transform(traj)
Featurize an MD trajectory into a vector space derived from residue-residue distances.
Parameters:
traj : mdtraj.Trajectory
A molecular dynamics trajectory to featurize.
Returns:
features : np.ndarray, dtype=float, shape=(n_samples, n_features)
A featurized trajectory is a 2D array of shape (length_of_trajectory x n_features) where each features[i] vector is computed by applying the featurization function to the i-th snapshot of the input trajectory.
See also
transform : simultaneously featurize a collection of MD trajectories
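The (n_samples, n_features) layout described above can be illustrated with a small pure-Python stand-in. featurize_frames is a hypothetical function, and the toy "trajectory" holds one made-up coordinate per residue per frame; the real method takes an mdtraj.Trajectory and returns a NumPy array.

```python
import math


def featurize_frames(frames, pairs):
    """Hypothetical stand-in for partial_transform on toy data.

    `frames` is a list of snapshots; each snapshot holds one (x, y, z)
    coordinate per residue. Returns one row per frame and one column
    per residue pair.
    """
    return [[math.dist(frame[i], frame[j]) for i, j in pairs]
            for frame in frames]


# Two frames, three residues, made-up coordinates.
frames = [
    [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 2.0)],
    [(0.0, 0.0, 0.0), (6.0, 8.0, 0.0), (0.0, 0.0, 1.0)],
]
pairs = [(0, 1), (0, 2)]
features = featurize_frames(frames, pairs)
n_samples, n_features = len(features), len(features[0])
```

Here features[i] is the distance vector for frame i, so n_samples equals the trajectory length and n_features equals the number of residue pairs.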

set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Returns:
self
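The <component>__<parameter> naming can be illustrated with a short sketch that separates an estimator's own parameters from those addressed to a nested component. split_nested_params and the component name "featurizer" are hypothetical and only demonstrate the double-underscore convention, not how set_params is implemented.

```python
def split_nested_params(params):
    """Hypothetical helper: group 'component__parameter' keys by component."""
    own, nested = {}, {}
    for key, value in params.items():
        component, separator, sub_key = key.partition("__")
        if separator:
            # 'featurizer__scheme' targets parameter 'scheme' of 'featurizer'.
            nested.setdefault(component, {})[sub_key] = value
        else:
            own[key] = value
    return own, nested


own, nested = split_nested_params(
    {"scheme": "ca", "featurizer__scheme": "closest"})
```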

summarize()
Return some diagnostic summary statistics about this Markov model.

transform(traj_list, y=None)
Featurize several trajectories.
Parameters:
traj_list : list(mdtraj.Trajectory)
Trajectories to be featurized.
Returns:
features : list(np.ndarray), length = len(traj_list)
The featurized trajectories. features[i] is the featurized version of traj_list[i] and has shape (n_samples_i, n_features).
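The relationship between the input list and the output list can be sketched as below. transform_all and featurize_one are hypothetical stand-ins operating on toy data; the sketch assumes each trajectory is featurized independently, which matches the description above, but it is not taken from the library's source.

```python
def featurize_one(traj):
    """Hypothetical per-trajectory featurizer: two features per frame."""
    return [[frame, 2 * frame] for frame in traj]


def transform_all(traj_list):
    """Hypothetical stand-in for transform: featurize each trajectory in turn."""
    return [featurize_one(traj) for traj in traj_list]


# Toy "trajectories" of different lengths; each frame is a single number.
traj_list = [[1, 2, 3], [4]]
features = transform_all(traj_list)
shapes = [(len(f), len(f[0])) for f in features]
```

The trajectories may have different lengths, so the result is a list of arrays that share n_features but not n_samples, rather than a single stacked array.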