msmbuilder.preprocessing.MultiLabelBinarizer
-
class msmbuilder.preprocessing.MultiLabelBinarizer(classes=None, sparse_output=False)
Transform between an iterable of iterables and a multilabel format.
Although a list of sets or tuples is a very intuitive format for multilabel data, it is unwieldy to process. This transformer converts between this intuitive format and the supported multilabel format: a (samples x classes) binary matrix indicating the presence of a class label.
Parameters: - classes : array-like of shape [n_classes] (optional)
Indicates an ordering for the class labels.
- sparse_output : boolean (default: False)
Set to True if the output binary array is desired in CSR sparse format.
See also
sklearn.preprocessing.OneHotEncoder
- encode categorical integer features using a one-hot aka one-of-K scheme.
Examples
>>> from sklearn.preprocessing import MultiLabelBinarizer
>>> mlb = MultiLabelBinarizer()
>>> mlb.fit_transform([(1, 2), (3,)])
array([[1, 1, 0],
       [0, 0, 1]])
>>> mlb.classes_
array([1, 2, 3])

>>> mlb.fit_transform([set(['sci-fi', 'thriller']), set(['comedy'])])
array([[0, 1, 1],
       [1, 0, 0]])
>>> list(mlb.classes_)
['comedy', 'sci-fi', 'thriller']
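The mapping described above, from label collections to a (samples x classes) binary matrix, can be sketched in plain Python. This is only an illustration of the format, not the library's implementation; the helper name `binarize` is hypothetical, and it assumes that an explicit `classes` list covers every label that occurs.

```python
def binarize(label_sets, classes=None):
    """Return (classes, matrix) for an iterable of label iterables.

    Hypothetical helper for illustration; not part of msmbuilder or sklearn.
    """
    # Materialize the input so it can be traversed twice.
    label_sets = [list(labels) for labels in label_sets]
    if classes is None:
        # No explicit ordering given: use the sorted set of labels seen.
        classes = sorted({label for labels in label_sets for label in labels})
    else:
        classes = list(classes)
    # Map each class label to its column index.
    column = {label: j for j, label in enumerate(classes)}
    matrix = []
    for labels in label_sets:
        row = [0] * len(classes)
        for label in labels:
            # Assumption: every label appears in `classes` (KeyError otherwise).
            row[column[label]] = 1
        matrix.append(row)
    return classes, matrix


classes, matrix = binarize([(1, 2), (3,)])
print(classes)
print(matrix)

# Supplying `classes` fixes the column order instead of sorting.
ordered, reordered = binarize(
    [{"sci-fi", "thriller"}, {"comedy"}],
    classes=["thriller", "sci-fi", "comedy"],
)
print(ordered)
print(reordered)
```

Each row corresponds to one sample and each column to one class, so the column order is determined entirely by the class ordering.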
Attributes: - classes_ : array of labels
A copy of the classes parameter where provided, or otherwise, the sorted set of classes found when fitting.
Methods
fit(X[, y]): Fit Preprocessing to X.
fit_transform(sequences[, y]): Fit the model and apply preprocessing.
get_params([deep]): Get parameters for this estimator.
inverse_transform(yt): Transform the given indicator matrix into label sets.
partial_fit(sequence[, y]): Fit Preprocessing to X.
partial_transform(sequence): Apply preprocessing to single sequence.
set_params(**params): Set the parameters of this estimator.
summarize(): Return some diagnostic summary statistics about this Markov model.
transform(sequences): Apply preprocessing to sequences.
-
__init__(classes=None, sparse_output=False)
Initialize self. See help(type(self)) for accurate signature.
-
fit(X, y=None)
Fit Preprocessing to X.
Parameters: - X : array-like, [sequence_length, n_features]
A multivariate timeseries.
- y : None
Ignored
Returns: - self
-
fit_transform(sequences, y=None)
Fit the model and apply preprocessing.
Parameters: - sequences : list of array-like, each of shape (n_samples_i, n_features)
Training data, where n_samples_i is the number of samples in sequence i and n_features is the number of features.
- y : None
Ignored
Returns: - sequence_new : list of array-like, each of shape (n_samples_i, n_components)
-
get_params(deep=True)
Get parameters for this estimator.
Parameters: - deep : boolean, optional
If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: - params : mapping of string to any
Parameter names mapped to their values.
-
inverse_transform(yt)
Transform the given indicator matrix into label sets.
Parameters: - yt : array or sparse matrix of shape (n_samples, n_classes)
A matrix containing only 1s and 0s.
Returns: - y : list of tuples
The set of labels for each sample such that y[i] consists of classes_[j] for each yt[i, j] == 1.
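The rule stated for the return value, that y[i] collects classes_[j] wherever yt[i, j] == 1, can be written out directly. The following is a minimal plain-Python sketch of that rule, assuming a dense list-of-lists matrix; the function name `labels_from_indicator` is hypothetical and is not part of the library.

```python
def labels_from_indicator(yt, classes):
    """Return one tuple of labels per row of the 0/1 matrix `yt`.

    Hypothetical helper for illustration only.
    """
    result = []
    for row in yt:
        if len(row) != len(classes):
            raise ValueError("row length must equal the number of classes")
        # Keep classes[j] for every column j whose entry is 1.
        result.append(tuple(classes[j] for j, flag in enumerate(row) if flag == 1))
    return result


classes = ["comedy", "sci-fi", "thriller"]
yt = [[0, 1, 1], [1, 0, 0], [0, 0, 0]]
label_tuples = labels_from_indicator(yt, classes)
print(label_tuples)
```

A row of all zeros maps to an empty tuple, which is how a sample with no labels would be represented under this rule.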
-
partial_fit(sequence, y=None)
Fit Preprocessing to X.
Parameters: - sequence : array-like, [sequence_length, n_features]
A multivariate timeseries.
- y : None
Ignored
Returns: - self
-
partial_transform(sequence)
Apply preprocessing to single sequence.
Parameters: - sequence : array-like, shape (n_samples, n_features)
A single sequence to transform
Returns: - out : array-like, shape (n_samples, n_features)
-
set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter>, so that it is possible to update each component of a nested object.
Returns: - self
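The <component>__<parameter> naming convention can be illustrated by splitting each keyword on the first double underscore. This is a sketch of the naming scheme only, not of how set_params is implemented; the function `split_nested_params` and the component name `binarizer` are hypothetical.

```python
def split_nested_params(params):
    """Separate plain parameters from '<component>__<parameter>' entries.

    Hypothetical helper for illustration only.
    """
    own, nested = {}, {}
    for key, value in params.items():
        # Split on the first "__" only; the remainder may itself be nested.
        component, sep, sub_key = key.partition("__")
        if sep:
            nested.setdefault(component, {})[sub_key] = value
        else:
            own[key] = value
    return own, nested


own, nested = split_nested_params({
    "sparse_output": True,                # parameter of the estimator itself
    "binarizer__classes": [1, 2, 3],      # parameter of a component named "binarizer"
    "binarizer__sparse_output": False,
})
print(own)
print(nested)
```

Parameters without a double underscore belong to the estimator itself; the rest are grouped by the component they address.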
-
summarize()
Return some diagnostic summary statistics about this Markov model.
-
transform(sequences)
Apply preprocessing to sequences.
Parameters: - sequences : list of array-like, each of shape (n_samples_i, n_features)
Sequence data to transform, where n_samples_i is the number of samples in sequence i and n_features is the number of features.
Returns: - sequence_new : list of array-like, each of shape (n_samples_i, n_components)
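One plausible reading of the sequence-oriented methods is that fit accumulates statistics over every sequence and transform applies partial_transform to each sequence in turn, returning a list with the same sequence lengths. The toy class below sketches that contract under this assumption, which the text above does not confirm; `ToyCenterer` is hypothetical and unrelated to the library's own classes.

```python
class ToyCenterer:
    """Hypothetical preprocessor that subtracts the per-feature mean."""

    def __init__(self):
        self.n_seen_ = 0
        self.sums_ = None

    def partial_fit(self, sequence, y=None):
        # Accumulate per-feature sums from a single sequence.
        for row in sequence:
            if self.sums_ is None:
                self.sums_ = [0.0] * len(row)
            for j, value in enumerate(row):
                self.sums_[j] += value
            self.n_seen_ += 1
        return self

    def fit(self, sequences, y=None):
        # Assumption: fitting resets state, then visits every sequence.
        self.n_seen_, self.sums_ = 0, None
        for sequence in sequences:
            self.partial_fit(sequence)
        return self

    def partial_transform(self, sequence):
        means = [total / self.n_seen_ for total in self.sums_]
        return [[value - mean for value, mean in zip(row, means)] for row in sequence]

    def transform(self, sequences):
        # One output sequence per input sequence, lengths preserved.
        return [self.partial_transform(sequence) for sequence in sequences]

    def fit_transform(self, sequences, y=None):
        return self.fit(sequences).transform(sequences)


# Two sequences of different lengths, each row holding two features.
sequences = [[[1.0, 10.0], [3.0, 30.0]], [[5.0, 50.0]]]
centered = ToyCenterer().fit_transform(sequences)
print(centered)
```

The point of the sketch is the shape contract: the input is a list of sequences of possibly different lengths, and the output is a list with one transformed sequence per input.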