msmbuilder.preprocessing.PolynomialFeatures
class msmbuilder.preprocessing.PolynomialFeatures(degree=2, interaction_only=False, include_bias=True)
Generate polynomial and interaction features.
Generate a new feature matrix consisting of all polynomial combinations of the features with degree less than or equal to the specified degree. For example, if an input sample is two dimensional and of the form [a, b], the degree-2 polynomial features are [1, a, b, a^2, ab, b^2].
Parameters: degree : integer
The degree of the polynomial features. Default = 2.
interaction_only : boolean, default = False
If True, only interaction features are produced: features that are products of at most degree distinct input features (so not x[1] ** 2, x[0] * x[2] ** 3, etc.).
include_bias : boolean
If True (default), then include a bias column, the feature in which all polynomial powers are zero (i.e. a column of ones, which acts as an intercept term in a linear model).
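As a rough illustration of how the three parameters interact, the pure-Python sketch below enumerates exponent rows (one row per output feature, in the sense of the powers_ attribute) and applies them to a single sample. The helper names exponent_rows and expand are hypothetical, and this is not the library's implementation; the ordering by total degree is assumed from the [1, a, b, a^2, ab, b^2] example above.

```python
from itertools import combinations, combinations_with_replacement
from math import prod


def exponent_rows(n_features, degree=2, interaction_only=False, include_bias=True):
    """List one exponent row per output feature (hypothetical helper)."""
    # interaction_only forbids repeating an input, so use plain combinations.
    pick = combinations if interaction_only else combinations_with_replacement
    # Total degree 0 is the bias term; skip it when include_bias is False.
    start = 0 if include_bias else 1
    rows = []
    for d in range(start, degree + 1):
        for combo in pick(range(n_features), d):
            # Exponent of input j is how often j appears in the combination.
            rows.append([combo.count(j) for j in range(n_features)])
    return rows


def expand(sample, **kwargs):
    """Evaluate every output feature for a single sample (hypothetical helper)."""
    rows = exponent_rows(len(sample), **kwargs)
    return [prod(x ** p for x, p in zip(sample, row)) for row in rows]


print(expand([2, 3]))                         # [1, 2, 3, 4, 6, 9]
print(expand([2, 3], interaction_only=True))  # [1, 2, 3, 6]
```

With interaction_only=True the squared terms a^2 and b^2 disappear, and with include_bias=False the leading 1 is dropped.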
Notes
Be aware that the number of features in the output array scales polynomially in the number of features of the input array, and exponentially in the degree. High degrees can cause overfitting.
See examples/linear_model/plot_polynomial_interpolation.py
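To make the growth described in these notes concrete, the sketch below counts output features in closed form. The function name count_output_features is hypothetical; the formulas are the standard counts of monomials of total degree at most degree (with repeated inputs allowed, or with distinct inputs only when interaction_only is set), not code taken from the library.

```python
from math import comb


def count_output_features(n_features, degree=2, interaction_only=False,
                          include_bias=True):
    """Count polynomial output features (hypothetical helper)."""
    if interaction_only:
        # Choose d distinct inputs for every total degree d = 0..degree.
        total = sum(comb(n_features, d) for d in range(degree + 1))
    else:
        # Monomials of total degree <= degree in n_features variables.
        total = comb(n_features + degree, degree)
    # The degree-0 monomial is the bias column.
    return total if include_bias else total - 1


print(count_output_features(2, 2))    # 6
print(count_output_features(10, 3))   # 286
print(count_output_features(10, 5))   # 3003
```

Ten input features already yield 3003 output columns at degree 5, which is why high degrees become expensive and prone to overfitting.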
Examples
>>> X = np.arange(6).reshape(3, 2)
>>> X
array([[0, 1],
       [2, 3],
       [4, 5]])
>>> poly = PolynomialFeatures(2)
>>> poly.fit_transform(X)
array([[  1.,   0.,   1.,   0.,   0.,   1.],
       [  1.,   2.,   3.,   4.,   6.,   9.],
       [  1.,   4.,   5.,  16.,  20.,  25.]])
>>> poly = PolynomialFeatures(interaction_only=True)
>>> poly.fit_transform(X)
array([[  1.,   0.,   1.,   0.],
       [  1.,   2.,   3.,   6.],
       [  1.,   4.,   5.,  20.]])
Attributes
powers_ (array, shape (n_output_features, n_input_features)) powers_[i, j] is the exponent of the jth input in the ith output.
n_input_features_ (int) The total number of input features.
n_output_features_ (int) The total number of polynomial output features. The number of output features is computed by iterating over all suitably sized combinations of input features.
Methods
fit(sequences[, y]) Fit Preprocessing to X.
fit_transform(sequences[, y]) Fit the model and apply preprocessing
get_feature_names([input_features]) Return feature names for output features
get_params([deep]) Get parameters for this estimator.
partial_fit(sequence[, y]) Fit Preprocessing to X.
partial_transform(sequence) Apply preprocessing to single sequence
set_params(**params) Set the parameters of this estimator.
summarize() Return some diagnostic summary statistics about this Markov model
transform(sequences) Apply preprocessing to sequences
-
__init__(degree=2, interaction_only=False, include_bias=True)
-
fit(sequences, y=None)
Fit Preprocessing to X.
Parameters: sequences : list of array-like, each of shape [sequence_length, n_features]
A list of multivariate timeseries. Each sequence may have a different length, but they all must have the same number of features.
y : None
Ignored
Returns: self
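The input layout described above (a list of sequences of differing length that share one feature count) can be sketched with plain nested lists. The helper check_sequences is hypothetical and only illustrates the constraint; whether and how the library validates its input is not stated here.

```python
def check_sequences(sequences):
    """Return the shared feature count, or raise ValueError (hypothetical helper)."""
    # Collect the width of every row in every sequence.
    widths = {len(row) for seq in sequences for row in seq}
    if len(widths) != 1:
        raise ValueError("all sequences must have the same number of features")
    return widths.pop()


# Two sequences of different length (2 and 1 samples), both with 2 features.
sequences = [
    [[0, 1], [2, 3]],
    [[4, 5]],
]
print(check_sequences(sequences))  # 2
```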
-
fit_transform(sequences, y=None)
Fit the model and apply preprocessing
Parameters: sequences : list of array-like, each of shape (n_samples_i, n_features)
Training data, where n_samples_i is the number of samples in sequence i and n_features is the number of features.
y : None
Ignored
Returns: sequence_new : list of array-like, each of shape (n_samples_i, n_components)
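The return shape can be pictured as a per-sequence map: each sequence keeps its own length, and only the number of columns changes. The sketch below uses nested lists and a hard-coded degree-2 expansion for two features; both helper names are hypothetical, and the real method operates on arrays.

```python
def degree2(row):
    """Degree-2 expansion of a two-feature row (hypothetical helper)."""
    a, b = row
    return [1, a, b, a * a, a * b, b * b]


def transform_sequences(sequences, row_fn):
    """Apply row_fn to every row of every sequence, preserving sequence lengths."""
    return [[row_fn(row) for row in seq] for seq in sequences]


sequences = [[[0, 1], [2, 3]], [[4, 5]]]
out = transform_sequences(sequences, degree2)
print([len(seq) for seq in out])  # [2, 1]
print(out[1])                     # [[1, 4, 5, 16, 20, 25]]
```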
-
get_feature_names(input_features=None)
Return feature names for output features
Parameters: input_features : list of string, length n_features, optional
String names for input features if available. By default, “x0”, “x1”, ... “xn_features” are used.
Returns: output_feature_names : list of string, length n_output_features
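A possible shape for the returned names is sketched below. The format ("1" for the bias, "x0^2" for powers, a space between factors) is an assumption modeled on scikit-learn's PolynomialFeatures and is not confirmed by this page; the function feature_names is a hypothetical stand-in for the method.

```python
from itertools import combinations_with_replacement


def feature_names(n_features, degree=2, input_features=None):
    """Build names for polynomial output features (assumed naming format)."""
    if input_features is None:
        input_features = ["x%d" % i for i in range(n_features)]
    names = []
    for d in range(degree + 1):
        for combo in combinations_with_replacement(range(n_features), d):
            if not combo:
                names.append("1")  # bias column
                continue
            parts = []
            for j in sorted(set(combo)):
                power = combo.count(j)
                name = input_features[j]
                parts.append(name if power == 1 else "%s^%d" % (name, power))
            names.append(" ".join(parts))
    return names


print(feature_names(2))
# ['1', 'x0', 'x1', 'x0^2', 'x0 x1', 'x1^2']
print(feature_names(2, input_features=["a", "b"]))
# ['1', 'a', 'b', 'a^2', 'a b', 'b^2']
```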
-
get_params(deep=True)
Get parameters for this estimator.
Parameters: deep : boolean, optional
If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params : mapping of string to any
Parameter names mapped to their values.
-
partial_fit(sequence, y=None)
Fit Preprocessing to X.
Parameters: sequence : array-like, [sequence_length, n_features]
A multivariate timeseries.
y : None
Ignored
Returns: self
-
partial_transform(sequence)
Apply preprocessing to single sequence
Parameters: sequence: array like, shape (n_samples, n_features)
A single sequence to transform
Returns: out : array like, shape (n_samples, n_features)
-
set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Returns: self
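The <component>__<parameter> convention can be sketched as a simple name split. The helpers below are hypothetical and only show how such keys might be routed to the estimator itself or to a named component; the component name "poly" is made up for the example.

```python
def split_param(name):
    """Split 'component__parameter' at the first double underscore."""
    component, sep, parameter = name.partition("__")
    if not sep:
        # No separator: the parameter belongs to the estimator itself.
        return None, component
    return component, parameter


def group_params(params):
    """Separate an estimator's own parameters from nested ones."""
    own, nested = {}, {}
    for name, value in params.items():
        component, parameter = split_param(name)
        if component is None:
            own[parameter] = value
        else:
            nested.setdefault(component, {})[parameter] = value
    return own, nested


own, nested = group_params({"degree": 3, "poly__include_bias": False})
print(own)     # {'degree': 3}
print(nested)  # {'poly': {'include_bias': False}}
```

Splitting at the first separator leaves deeper names such as a__b__c as ("a", "b__c"), so each level of nesting can resolve its own part.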
-
summarize()
Return some diagnostic summary statistics about this Markov model
-
transform(sequences)
Apply preprocessing to sequences
Parameters: sequences : list of array-like, each of shape (n_samples_i, n_features)
Sequence data to transform, where n_samples_i is the number of samples in sequence i and n_features is the number of features.
Returns: sequence_new : list of array-like, each of shape (n_samples_i, n_components)
-