Preprocessing

Preprocessing a dataset is a common requirement for many machine learning estimators; typical steps include scaling, centering, normalization, smoothing, binarization, and imputation.
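As a quick illustration of the scaling and centering steps, here is a minimal sketch using scikit-learn's StandardScaler and MinMaxScaler, which share the names and signatures listed below (assuming this package follows the scikit-learn fit/transform convention):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler, MinMaxScaler

X = np.array([[1.0, 200.0],
              [2.0, 300.0],
              [3.0, 400.0]])

# StandardScaler: center each column to zero mean and scale to unit variance.
X_std = StandardScaler().fit_transform(X)

# MinMaxScaler: rescale each column to the [0, 1] range.
X_mm = MinMaxScaler().fit_transform(X)
```

After transformation, every column of `X_std` has mean 0, and every column of `X_mm` spans exactly [0, 1].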

Preprocessors

Binarizer([threshold, copy]) Binarize data (set feature values to 0 or 1) according to a threshold.
Butterworth([width, order, analog]) Smooth time-series data using a low-pass, zero-delay Butterworth filter.
EWMA([com, span, halflife, min_periods, ...]) Smooth time-series data using an exponentially-weighted moving average filter.
DoubleEWMA([com, span, halflife, ...]) Smooth time-series data using forward and backward exponentially-weighted moving average filters.
Imputer([missing_values, strategy, axis, ...]) Imputation transformer for completing missing values.
KernelCenterer Center a kernel matrix.
LabelBinarizer([neg_label, pos_label, ...]) Binarize labels in a one-vs-all fashion.
MultiLabelBinarizer([classes, sparse_output]) Transform between an iterable of iterables and a multilabel format.
MinMaxScaler([feature_range, copy]) Transform features by scaling each feature to a given range.
MaxAbsScaler([copy]) Scale each feature by its maximum absolute value.
Normalizer([norm, copy]) Normalize samples individually to unit norm.
RobustScaler([with_centering, with_scaling, ...]) Scale features using statistics that are robust to outliers.
StandardScaler([copy, with_mean, with_std]) Standardize features by removing the mean and scaling to unit variance.
PolynomialFeatures([degree, ...]) Generate polynomial and interaction features.
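The smoothing entries above (EWMA and DoubleEWMA) are built on the exponentially-weighted moving average recursion. The following is a minimal sketch of that idea in plain NumPy, not the package's implementation: a single forward pass lags behind the signal, while averaging a forward pass with a backward pass (the DoubleEWMA approach) cancels that phase lag:

```python
import numpy as np

def ewma(x, alpha):
    # Recursive exponentially-weighted moving average:
    #   y[t] = alpha * x[t] + (1 - alpha) * y[t-1]
    # alpha in (0, 1]; larger alpha tracks the signal more closely.
    y = np.empty(len(x), dtype=float)
    y[0] = x[0]
    for t in range(1, len(x)):
        y[t] = alpha * x[t] + (1 - alpha) * y[t - 1]
    return y

def double_ewma(x, alpha):
    # Filter forward, then filter the reversed series and reverse back;
    # averaging the two passes cancels the lag of each single pass.
    fwd = ewma(x, alpha)
    bwd = ewma(x[::-1], alpha)[::-1]
    return (fwd + bwd) / 2.0
```

Both filters leave a constant signal unchanged and attenuate high-frequency noise; the class parameters listed above (`com`, `span`, `halflife`) are just alternative ways of specifying the decay `alpha`, mirroring the pandas `ewm` conventions.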