K-Centers clustering
Cluster a vector or Trajectory dataset using a simple heuristic to minimize the maximum distance from any data point to its assigned cluster center.
The runtime of this algorithm is O(kN), where k is the number of clusters and N is the size of the dataset, making it one of the least expensive clustering algorithms available.
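The heuristic described above is commonly implemented as a farthest-first traversal. The following is a minimal NumPy sketch of that idea, not this class's actual implementation: it assumes a Euclidean metric and a single 2-D array as input, and the function name `k_centers` and its `seed` argument are hypothetical. Each of the k passes computes N distances, which is where the O(kN) runtime comes from.

```python
import numpy as np


def k_centers(X, n_clusters, seed=0):
    """Farthest-first traversal sketch (Euclidean metric only)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)

    # Pick the first center at random.
    center_ids = [int(rng.integers(len(X)))]

    # Distance from every point to its nearest center found so far.
    distances = np.linalg.norm(X - X[center_ids[0]], axis=1)
    labels = np.zeros(len(X), dtype=int)

    for k in range(1, n_clusters):
        # The next center is the point farthest from all current centers.
        new_id = int(np.argmax(distances))
        center_ids.append(new_id)

        # Reassign the points that are closer to the new center.
        d_new = np.linalg.norm(X - X[new_id], axis=1)
        closer = d_new < distances
        labels[closer] = k
        distances[closer] = d_new[closer]

    return X[center_ids], labels, distances
```

The three return values play the roles of `cluster_centers_`, `labels_` and `distances_` for a single sequence.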
Parameters: | n_clusters : int, optional, default: 8
metric : {“euclidean”, “sqeuclidean”, “cityblock”, “chebyshev”, “canberra”,
random_state : integer or numpy.RandomState, optional
|
---|
References
[R7] | Gonzalez, Teofilo F. “Clustering to minimize the maximum intercluster distance.” Theor. Comput. Sci. 38 (1985): 293-306. |
[R8] | Beauchamp, Kyle A., et al. “MSMBuilder2: modeling conformational dynamics on the picosecond to millisecond scale.” J. Chem. Theory. Comput. 7.10 (2011): 3412-3419. |
Attributes
cluster_centers_ | (array, [n_clusters, n_features]) Coordinates of cluster centers |
labels_ | (list of arrays, each of shape [sequence_length, ]) labels_[i] is an array of the labels of each point in sequence i. The label of each point is an integer in [0, n_clusters). |
distances_ | (list of arrays, each of shape [sequence_length, ]) distances_[i] is an array of the distances from each point in sequence i to the cluster center it is assigned to. |
Methods
fit(sequences[, y]) | Fit the kcenters clustering on the data |
fit_predict(sequences[, y]) | Perform clustering on sequences and return cluster labels. |
fit_transform(sequences[, y]) | Alias for fit_predict |
get_params([deep]) | Get parameters for this estimator. |
partial_predict(X[, y]) | Predict the closest cluster each sample in X belongs to. |
partial_transform(X) | Alias for partial_predict |
predict(sequences[, y]) | Predict the closest cluster for each sample in each sequence. |
set_params(**params) | Set the parameters of this estimator. |
summarize() | |
transform(sequences) | Alias for predict |
Fit the kcenters clustering on the data
Parameters: | sequences : list of array-like, each of shape [sequence_length, n_features]
|
---|---|
Returns: | self |
Perform clustering on sequences and return cluster labels.
Parameters: | sequences : list of array-like, each of shape [sequence_length, n_features]
|
---|---|
Returns: | Y : list of ndarray, each of shape [sequence_length, ]
|
Alias for fit_predict
Get parameters for this estimator.
Parameters: | deep : boolean, optional
|
---|---|
Returns: | params : mapping of string to any
|
Predict the closest cluster each sample in X belongs to.
In the vector quantization literature, cluster_centers_ is called the code book and each value returned by predict is the index of the closest code in the code book.
Parameters: | X : array-like shape=(n_samples, n_features)
|
---|---|
Returns: | Y : array, shape=(n_samples,)
|
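As a rough illustration of the code-book lookup described above, the sketch below assigns each row of `X` to the index of its nearest center. It assumes a Euclidean metric; the function name `assign_to_code_book` is hypothetical and is not part of this class.

```python
import numpy as np


def assign_to_code_book(X, code_book):
    """Return, for each sample, the index of the nearest code (Euclidean)."""
    X = np.asarray(X, dtype=float)
    code_book = np.asarray(code_book, dtype=float)

    # Pairwise distances, shape (n_samples, n_clusters).
    d = np.linalg.norm(X[:, None, :] - code_book[None, :, :], axis=2)

    # Index of the closest code for every sample, shape (n_samples,).
    return np.argmin(d, axis=1)
```

Here `code_book` stands in for `cluster_centers_`, with shape [n_clusters, n_features].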
Alias for partial_predict
Predict the closest cluster for each sample in each sequence in sequences.
In the vector quantization literature, cluster_centers_ is called the code book and each value returned by predict is the index of the closest code in the code book.
Parameters: | sequences : list of array-like, each of shape [sequence_length, n_features]
|
---|---|
Returns: | Y : list of arrays, each of shape [sequence_length,]
|
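The list-in, list-out convention can be sketched as follows: stack the sequences, label every sample, then split the labels back at the original sequence boundaries. This is an assumption about one reasonable way to do it, not a description of the class internals; `predict_sequences` is a hypothetical name and the metric is assumed to be Euclidean.

```python
import numpy as np


def predict_sequences(sequences, code_book):
    """Label each sample in each sequence with its nearest center."""
    code_book = np.asarray(code_book, dtype=float)
    lengths = [len(s) for s in sequences]

    # Stack all sequences into one (total_samples, n_features) array.
    stacked = np.concatenate([np.asarray(s, dtype=float) for s in sequences])

    # Nearest center for every stacked sample.
    d = np.linalg.norm(stacked[:, None, :] - code_book[None, :, :], axis=2)
    labels = np.argmin(d, axis=1)

    # Split the flat label array back into one array per sequence.
    return np.split(labels, np.cumsum(lengths)[:-1])
```

The result is a list with one label array per input sequence, each of shape [sequence_length, ].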
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Returns: | self |
---|
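To make the `<component>__<parameter>` naming concrete, here is a small sketch of how such a key can be split into its two parts. The helper `split_nested_param` and the example key `cluster__n_clusters` are hypothetical and only illustrate the naming convention; they are not part of this estimator's API.

```python
def split_nested_param(name):
    """Split 'component__parameter' into (component, parameter).

    A name without a double underscore refers to a parameter of the
    estimator itself, so the component is returned as None.
    """
    component, sep, parameter = name.partition("__")
    if not sep:
        return None, name
    return component, parameter
```

For example, a key such as `cluster__n_clusters` would address the `n_clusters` parameter of a pipeline step named `cluster`, while a plain `n_clusters` addresses the estimator directly.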
Alias for predict