msmbuilder.feature_selection.VarianceThreshold

class msmbuilder.feature_selection.VarianceThreshold(threshold=0.0)
Feature selector that removes all low-variance features.
This feature selection algorithm looks only at the features (X), not the desired outputs (y), and can thus be used for unsupervised learning.
Read more in the User Guide.
Parameters:
- threshold : float, optional
Features whose training-set variance is at or below this threshold are removed. With the default of 0.0, every feature with non-zero variance is kept, i.e. only features that have the same value in all samples are removed.
Examples
The following dataset has integer features, two of which are the same in every sample. These are removed with the default setting for threshold:
>>> X = [[0, 2, 0, 3], [0, 1, 4, 3], [0, 1, 1, 3]]
>>> selector = VarianceThreshold()
>>> selector.fit_transform(X)
array([[2, 0],
       [1, 4],
       [1, 1]])
Attributes:
- variances_ : array, shape (n_features,)
Variances of individual features.
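The selection rule can be sketched in plain Python, without msmbuilder itself. This is a minimal illustration, not the library's implementation: it assumes the population variance (dividing by the number of samples) and assumes a feature is kept only when its variance is strictly greater than the threshold. The names `variances`, `keep` and `X_reduced` are introduced here for illustration.

```python
from statistics import pvariance

# Same toy data as the example above: rows are samples, columns are features.
X = [[0, 2, 0, 3], [0, 1, 4, 3], [0, 1, 1, 3]]
threshold = 0.0

# zip(*X) transposes the rows, yielding one tuple of values per feature.
variances = [pvariance(column) for column in zip(*X)]

# Assumption: keep a feature only if its variance exceeds the threshold.
keep = [v > threshold for v in variances]

# Drop the columns that were not kept from every sample.
X_reduced = [[value for value, k in zip(row, keep) if k] for row in X]

print(keep)       # [False, True, True, False]
print(X_reduced)  # [[2, 0], [1, 4], [1, 1]]
```

The first and last columns are constant, so their variance is zero and they are dropped, which matches the output of the example above.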
Methods
fit(sequences[, y])            Fit the model
fit_transform(sequences[, y])  Fit the model and apply dimensionality reduction
get_params([deep])             Get parameters for this estimator.
get_support([indices])         Get a mask, or integer index, of the features selected
inverse_transform(X)           Reverse the transformation operation
partial_transform(sequence)    Apply dimensionality reduction to single sequence
set_params(**params)           Set the parameters of this estimator.
summarize()                    Return some diagnostic summary statistics about this Markov model
transform(sequences)           Apply dimensionality reduction to sequences
__init__(threshold=0.0)
Initialize self. See help(type(self)) for accurate signature.
fit(sequences, y=None)
Fit the model
Parameters:
- sequences : list of array-like, each of shape [sequence_length, n_features]
A list of multivariate timeseries. Each sequence may have a different length, but they all must have the same number of features.
- y : None
Ignored
Returns:
- self
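The input contract for fit can be illustrated with a small stand-alone sketch. It assumes, as this page does not state explicitly, that the variances are computed over the frames of all sequences pooled together. `pool_sequences` is a hypothetical helper written for this illustration and is not part of msmbuilder.

```python
from statistics import pvariance


def pool_sequences(sequences):
    """Concatenate the frames of all sequences after checking feature counts."""
    n_features = len(sequences[0][0])
    pooled = []
    for i, sequence in enumerate(sequences):
        for frame in sequence:
            if len(frame) != n_features:
                raise ValueError(
                    f"sequence {i} has a frame with {len(frame)} features, "
                    f"expected {n_features}"
                )
            pooled.append(list(frame))
    return pooled


# Two sequences of different lengths but the same number of features.
sequences = [
    [[0.0, 1.0], [0.0, 2.0], [0.0, 3.0]],  # 3 frames
    [[0.0, 5.0]],                          # 1 frame
]

pooled = pool_sequences(sequences)

# Assumption: one population variance per feature over the pooled frames.
variances = [pvariance(column) for column in zip(*pooled)]
print(len(pooled))  # 4
print(variances)
```

A sequence whose frames have a different number of features raises a ValueError in this sketch, mirroring the requirement that all sequences share the same feature count.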
fit_transform(sequences, y=None)
Fit the model and apply dimensionality reduction
Parameters:
- sequences : list of array-like, each of shape (n_samples_i, n_features)
Training data, where n_samples_i is the number of samples in sequence i and n_features is the number of features.
- y : None
Ignored
Returns:
- sequence_new : list of array-like, each of shape (n_samples_i, n_components)
get_params(deep=True)
Get parameters for this estimator.
Parameters:
- deep : boolean, optional
If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:
- params : mapping of string to any
Parameter names mapped to their values.
get_support(indices=False)
Get a mask, or integer index, of the features selected
Parameters:
- indices : boolean (default False)
If True, the return value will be an array of integers, rather than a boolean mask.
Returns:
- support : array
An index that selects the retained features from a feature vector. If indices is False, this is a boolean array of shape [# input features], in which an element is True iff its corresponding feature is selected for retention. If indices is True, this is an integer array of shape [# output features] whose values are indices into the input feature vector.
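The two return forms carry the same information. The sketch below converts between them in plain Python; the `mask` values are illustrative, not output from the library.

```python
# A boolean mask with one entry per input feature (illustrative values).
mask = [False, True, True, False]

# Integer form: positions of the retained features, one entry per output feature.
indices = [i for i, selected in enumerate(mask) if selected]
print(indices)  # [1, 2]

# Converting back requires knowing the number of input features.
n_input_features = len(mask)
retained = set(indices)
rebuilt_mask = [i in retained for i in range(n_input_features)]
print(rebuilt_mask == mask)  # True
```

The mask has length equal to the number of input features, while the index array has length equal to the number of retained features.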
inverse_transform(X)
Reverse the transformation operation
Parameters:
- X : array of shape [n_samples, n_selected_features]
The input samples.
Returns:
- X_r : array of shape [n_samples, n_original_features]
X with columns of zeros inserted where features would have been removed by transform.
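The zero-filling behaviour described above can be sketched as follows. `insert_zero_columns` is a hypothetical helper written for this illustration, and the mask is assumed to be the boolean support mask of the fitted selector.

```python
def insert_zero_columns(X_selected, mask):
    """Rebuild full-width rows, placing 0 wherever a feature was removed."""
    restored = []
    for row in X_selected:
        values = iter(row)
        # Consume the next retained value where the mask is True, else emit 0.
        restored.append([next(values) if selected else 0 for selected in mask])
    return restored


X_selected = [[2, 0], [1, 4], [1, 1]]
mask = [False, True, True, False]

X_r = insert_zero_columns(X_selected, mask)
print(X_r)  # [[0, 2, 0, 0], [0, 1, 4, 0], [0, 1, 1, 0]]
```

Note that the removed columns come back as zeros, not as their original values: a constant column of 3s that was dropped is restored as a column of 0s.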
partial_transform(sequence)
Apply dimensionality reduction to single sequence
Parameters:
- sequence : array-like, shape (n_samples, n_features)
A single sequence to transform
Returns:
- out : array-like, shape (n_samples, n_features)
set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form
<component>__<parameter>
so that it’s possible to update each component of a nested object.
Returns:
- self
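The `<component>__<parameter>` naming convention can be illustrated with a short sketch that splits such a name at the first double underscore. `split_param_name` is a hypothetical helper and `selector` is a hypothetical component name; neither comes from the library.

```python
def split_param_name(name):
    """Split 'component__parameter' into its two parts.

    Returns (None, name) for a plain parameter with no component prefix.
    """
    component, separator, parameter = name.partition("__")
    if not separator:
        return None, name
    return component, parameter


print(split_param_name("threshold"))            # (None, 'threshold')
print(split_param_name("selector__threshold"))  # ('selector', 'threshold')
```

Splitting only at the first separator leaves any deeper nesting in the parameter part, so each level of a nested object can pass the remainder on to the component it addresses.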
summarize()
Return some diagnostic summary statistics about this Markov model
transform(sequences)
Apply dimensionality reduction to sequences
Parameters:
- sequences : list of array-like, each of shape (n_samples_i, n_features)
Sequence data to transform, where n_samples_i is the number of samples in sequence i and n_features is the number of features.
Returns:
- sequence_new : list of array-like, each of shape (n_samples_i, n_components)
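A minimal sketch of what transforming a list of sequences amounts to, assuming the same column mask is applied independently to each sequence. `select_columns` is a hypothetical helper and the mask values are illustrative.

```python
def select_columns(sequence, mask):
    """Keep only the features whose mask entry is True, frame by frame."""
    return [[value for value, keep in zip(frame, mask) if keep] for frame in sequence]


sequences = [
    [[0.0, 1.0], [0.0, 2.0], [0.0, 3.0]],  # 3 frames
    [[0.0, 5.0]],                          # 1 frame
]
mask = [False, True]  # illustrative: drop the constant first feature

sequence_new = [select_columns(sequence, mask) for sequence in sequences]

print([len(s) for s in sequence_new])  # [3, 1]
print(sequence_new[1])                 # [[5.0]]
```

Each output sequence keeps its original number of frames; only the number of features per frame changes.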