Preprocessing
Preprocessing of a dataset is a common requirement for many machine learning estimators and may involve scaling, centering, normalization, smoothing, binarization, and imputation methods.
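As a minimal sketch of the scaling and centering transformers listed below, assuming the familiar scikit-learn-compatible `fit_transform` API (the exact import path may differ for this library):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler, MinMaxScaler

X = np.array([[1.0, -2.0],
              [3.0,  0.0],
              [5.0,  2.0]])

# Center each feature to zero mean and scale to unit variance.
scaled = StandardScaler().fit_transform(X)

# Rescale each feature linearly into the range [0, 1].
ranged = MinMaxScaler(feature_range=(0, 1)).fit_transform(X)
```

Both transformers learn their statistics (mean/variance, min/max) from the data passed to `fit`, so the same fitted instance can later apply an identical transformation to held-out data via `transform`.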
Preprocessors
| Preprocessor | Description |
| --- | --- |
| `Binarizer([threshold, copy])` | Binarize data (set feature values to 0 or 1) according to a threshold. |
| `Butterworth([width, order, analog])` | Smooth time-series data using a low-pass, zero-delay Butterworth filter. |
| `EWMA([com, span, halflife, min_periods, …])` | Smooth time-series data using an exponentially weighted moving average filter. |
| `DoubleEWMA([com, span, halflife, …])` | Smooth time-series data using forward and backward exponentially weighted moving average filters. |
| `Imputer([missing_values, strategy, axis, …])` | Imputation transformer for completing missing values. |
| `KernelCenterer` | Center a kernel matrix. |
| `LabelBinarizer([neg_label, pos_label, …])` | Binarize labels in a one-vs-all fashion. |
| `MultiLabelBinarizer([classes, sparse_output])` | Transform between an iterable of iterables and a multilabel format. |
| `MinMaxScaler([feature_range, copy])` | Transform features by scaling each feature to a given range. |
| `MaxAbsScaler([copy])` | Scale each feature by its maximum absolute value. |
| `Normalizer([norm, copy])` | Normalize samples individually to unit norm. |
| `RobustScaler([with_centering, with_scaling, …])` | Scale features using statistics that are robust to outliers. |
| `StandardScaler([copy, with_mean, with_std])` | Standardize features by removing the mean and scaling to unit variance. |
| `PolynomialFeatures([degree, …])` | Generate polynomial and interaction features. |
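As an illustration of feature generation with `PolynomialFeatures`, a sketch assuming the standard scikit-learn behavior (a bias column followed by all monomials up to the given degree):

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[2.0, 3.0]])

# degree=2 expands [x1, x2] into [1, x1, x2, x1^2, x1*x2, x2^2].
poly = PolynomialFeatures(degree=2)
Xp = poly.fit_transform(X)
# Xp is [[1., 2., 3., 4., 6., 9.]]
```

Setting `interaction_only=True` would keep only the cross terms (here `x1*x2`) and drop the pure powers, which limits the combinatorial growth of the feature space for higher degrees.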