msmbuilder.feature_selection.FeatureSelector

class msmbuilder.feature_selection.FeatureSelector(features, which_feat=None)

Concatenates results of multiple feature extraction objects.

This estimator applies a list of feature_extraction objects then concatenates the results. This is useful to combine several feature extraction mechanisms into a single transformer.

Note: Users should consider using msmbuilder.preprocessing.StandardScaler to normalize their data after combining feature sets.

Parameters:

features : list of (str, msmbuilder.feature_extraction) tuples

List of feature_extraction objects to be applied to the data. The first element of each tuple is the name of the feature_extraction object.

which_feat : list or str, optional

Name, or list of names, of the features to include in the transformer.
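
The roles of features and which_feat can be illustrated with a minimal sketch. It is not msmbuilder's implementation: the stand-in featurizers toy_distances and toy_angles, the helper concatenate_features, and the plain NumPy standardization at the end are all hypothetical, and plain arrays take the place of mdtraj.Trajectory objects. It only shows the idea of selecting named featurizers, stacking their outputs column-wise, and then normalizing the combined columns as the note above suggests.

```python
import numpy as np

def toy_distances(frames):
    # Hypothetical featurizer: one column, distance of each frame from the origin.
    return np.linalg.norm(frames, axis=1, keepdims=True)

def toy_angles(frames):
    # Hypothetical featurizer: two columns, sine and cosine of the first coordinate.
    return np.column_stack([np.sin(frames[:, 0]), np.cos(frames[:, 0])])

def concatenate_features(features, which_feat, frames):
    # Apply the selected featurizers in the order given by which_feat and
    # stack their outputs column-wise (same number of rows, more columns).
    lookup = dict(features)
    return np.hstack([lookup[name](frames) for name in which_feat])

features = [("dist", toy_distances), ("angle", toy_angles)]
frames = np.arange(12, dtype=float).reshape(4, 3)  # 4 frames, 3 coordinates each

combined = concatenate_features(features, ["dist", "angle"], frames)
print(combined.shape)  # (4, 3)

# Columns from different featurizers can sit on very different scales, so
# standardize each column to zero mean and unit variance afterwards.
scaled = (combined - combined.mean(axis=0)) / combined.std(axis=0)
```

Passing a shorter which_feat list, such as ["angle"], would keep only the columns produced by that featurizer.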

Attributes

which_feat

Methods

describe_features(traj) Return a list of dictionaries describing the features.
featurize(traj)
fit(traj_list[, y])
fit_transform(X[, y]) Fit to data, then transform it.
get_params([deep]) Get parameters for this estimator.
partial_transform(traj) Featurize an MD trajectory into a vector space.
set_params(**params) Set the parameters of this estimator.
summarize() Return some diagnostic summary statistics about this featurizer.
transform(traj_list[, y]) Featurize several trajectories.
__init__(features, which_feat=None)

describe_features(traj)

Return a list of dictionaries describing the features. Follows the ordering of featurizers in self.which_feat.

Parameters:

traj : mdtraj.Trajectory

The trajectory to describe

Returns:

feature_descs : list of dict

One dictionary per feature, with the following information about the atoms participating in that feature:

  • resnames: unique names of residues
  • atominds: atom indices involved in the feature
  • resseqs: unique residue sequence ids (not necessarily 0-indexed)
  • resids: unique residue ids (0-indexed)
  • featurizer: featurizer dependent
  • featuregroup: other info for the featurizer
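
For illustration, one entry of feature_descs might look like the dictionary below. The keys are the ones listed above; every value is made up (a hypothetical feature involving one atom from an alanine and one from a glycine), and both the value types and the exact contents are assumptions that depend on the featurizer.

```python
# Hypothetical description of a single feature; values are illustrative only.
feature_desc = {
    "resnames": ["ALA", "GLY"],
    "atominds": [4, 14],
    "resseqs": [1, 2],
    "resids": [0, 1],
    "featurizer": "AtomPairs",
    "featuregroup": "distance",
}

feature_descs = [feature_desc]

# Collect the atom indices used by every described feature.
atoms_used = sorted({i for d in feature_descs for i in d["atominds"]})
print(atoms_used)  # [4, 14]
```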

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters:

X : numpy array of shape [n_samples, n_features]

Training set.

y : numpy array of shape [n_samples]

Target values.

Returns:

X_new : numpy array of shape [n_samples, n_features_new]

Transformed array.

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep : boolean, optional

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params : mapping of string to any

Parameter names mapped to their values.

partial_transform(traj)

Featurize an MD trajectory into a vector space.

Parameters:

traj : mdtraj.Trajectory

A molecular dynamics trajectory to featurize.

Returns:

features : np.ndarray, dtype=float, shape=(n_samples, n_features)

A featurized trajectory is a 2D array of shape (length_of_trajectory x n_features) where each features[i] vector is computed by applying the featurization function to the `i`th snapshot of the input trajectory.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
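
The <component>__<parameter> naming can be sketched in a few lines. This is a simplified illustration of the double-underscore convention, not the actual set_params code; the helper set_nested and the classes Inner and Outer are hypothetical.

```python
class Inner:
    def __init__(self, stride=1):
        self.stride = stride

class Outer:
    def __init__(self, inner, lag=1):
        self.inner = inner
        self.lag = lag

def set_nested(obj, **params):
    # Split each key at the first double underscore: the part before it names
    # a component, the part after it names a parameter of that component.
    for key, value in params.items():
        name, sep, sub_name = key.partition("__")
        if sep:
            set_nested(getattr(obj, name), **{sub_name: value})
        else:
            setattr(obj, name, value)
    return obj  # returning the object allows call chaining

model = set_nested(Outer(Inner()), lag=5, inner__stride=2)
print(model.lag, model.inner.stride)  # 5 2
```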

Returns:

self

summarize()

Return some diagnostic summary statistics about this featurizer.

transform(traj_list, y=None)

Featurize several trajectories.

Parameters:

traj_list : list(mdtraj.Trajectory)

Trajectories to be featurized.

Returns:

features : list(np.ndarray), length = len(traj_list)

The featurized trajectories. features[i] is the featurized version of traj_list[i] and has shape (n_samples_i, n_features)
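
The relationship between transform and partial_transform can be sketched as follows, assuming transform simply applies the per-trajectory featurization to each element of the list (an assumption; the implementation is not shown here). The functions toy_partial_transform and toy_transform are hypothetical stand-ins that work on plain arrays rather than mdtraj.Trajectory objects.

```python
import numpy as np

def toy_partial_transform(traj):
    # Hypothetical stand-in: row i holds the features of snapshot i.
    # Here the two features are the mean and the maximum of each snapshot.
    return np.column_stack([traj.mean(axis=1), traj.max(axis=1)])

def toy_transform(traj_list):
    # One output array per input trajectory, in the same order.
    return [toy_partial_transform(traj) for traj in traj_list]

# Trajectories may differ in length (5 and 8 snapshots), so the result is a
# list of arrays rather than a single stacked array.
traj_list = [np.random.rand(5, 6), np.random.rand(8, 6)]
features = toy_transform(traj_list)
print([f.shape for f in features])  # [(5, 2), (8, 2)]
```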