Multi-View K-Means¶
The MultiViewKMeans class implements a multi-view extension of the traditional K-Means clustering algorithm. It clusters samples that are described by several views (feature sets) of the same data, produces a single consensus assignment that draws on every view, and can optionally learn an importance weight for each view.
- class polyview.cluster.mv_kmeans.MultiViewKMeans(*args: Any, **kwargs: Any)¶
Bases: BaseMultiViewClusterer
Multi-view K-Means clustering.
- Parameters:
n_clusters (int, default=2) – Number of clusters.
gamma (float, default=2.0) – Controls the distribution of view weights alpha(v).
max_iter (int, default=50) – Maximum number of alternating-update iterations.
n_init (int, default=10) – Number of random restarts. Best result (lowest objective) kept.
tol (float, default=1e-6) – Stop when the relative change in objective falls below this.
learn_weights (bool, default=True) – If True, run RMKMC (adaptive view weights). If False, run SMKMC (equal view weights, simpler).
eps (float, default=1e-10) – Small constant added to row norms before inversion (D update) to avoid division by zero on zero-residual samples.
random_state (int or None, default=None) – Random seed.
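To illustrate why eps matters, here is a minimal sketch of a row-norm reweighting step. It assumes the D update takes the form d_i = 1 / (2 * (||e_i|| + eps)), where e_i is the residual of sample i in one view; the function name and the exact formula are illustrative assumptions, not necessarily what polyview implements.

```python
import math

def residual_weights(residuals, eps=1e-10):
    """Return d_i = 1 / (2 * (||e_i|| + eps)) for each residual row e_i.

    `residuals` is a list of rows (hypothetical input), one per sample.
    """
    weights = []
    for row in residuals:
        # Euclidean norm of the residual row.
        norm = math.sqrt(sum(value * value for value in row))
        # eps keeps the denominator positive when the residual is exactly zero.
        weights.append(1.0 / (2.0 * (norm + eps)))
    return weights

rows = [[3.0, 4.0], [0.0, 0.0]]  # second sample sits exactly on its centroid
d = residual_weights(rows)
print(d)
```

Without eps, the second sample would trigger a division by zero; with it, the sample simply receives a very large but finite weight.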
- labels_¶
Consensus cluster assignment.
- Type:
ndarray of shape (n_samples,)
- centroids_¶
Per-view cluster centroid matrices F(v).
- Type:
list of ndarray, shape (n_clusters, n_features_v)
- weights_¶
Learned view importance weights alpha(v). All equal to 1/n_views when learn_weights=False.
- Type:
ndarray of shape (n_views,)
- objective_¶
Final value of the objective function.
- Type:
float
- n_iter_¶
Number of iterations performed in the best run.
- Type:
int
Notes
- The gamma parameter controls how the view weights alpha(v) are distributed:
gamma -> inf gives equal weights.
gamma -> 1 collapses weight onto the single best view.
The paper recommends searching log10(gamma) in [0.1, 2.0].
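The two limits above can be checked numerically. The sketch below assumes the closed-form update in which each weight is proportional to the view's loss raised to the power 1 / (1 - gamma), then normalized to sum to one. The function name and the loss values are hypothetical and only serve the illustration.

```python
def view_weights(view_losses, gamma):
    """Weights alpha(v) proportional to loss_v ** (1 / (1 - gamma)).

    view_losses: positive per-view clustering losses (hypothetical inputs).
    gamma: exponent parameter, must be greater than 1.
    """
    if gamma <= 1.0:
        raise ValueError("gamma must be greater than 1")
    exponent = 1.0 / (1.0 - gamma)
    # Dividing by the smallest loss first keeps every power in (0, 1]
    # and avoids overflow when gamma is close to 1.
    smallest = min(view_losses)
    raw = [(loss / smallest) ** exponent for loss in view_losses]
    total = sum(raw)
    return [value / total for value in raw]

losses = [1.0, 2.0, 4.0]  # view 0 fits best (lowest loss)
for gamma in (1.05, 2.0, 100.0):
    print(gamma, [round(w, 3) for w in view_weights(losses, gamma)])
```

With gamma close to 1, almost all weight goes to the lowest-loss view; as gamma grows, the weights approach 1/n_views.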
Setting learn_weights=False recovers the Simple Multi-view K-Means (SMKMC) variant from the same paper, which uses equal view weights throughout.
Examples
>>> import numpy as np
>>> from polyview.cluster.mv_kmeans import MultiViewKMeans
>>> X1 = np.random.rand(100, 4)
>>> X2 = np.random.rand(100, 6)
>>> model = MultiViewKMeans(n_clusters=3, random_state=0)
>>> labels = model.fit_predict([X1, X2])
>>> labels.shape
(100,)
Reference¶
Cai, X., Nie, F., & Huang, H. (2013). Multi-View K-Means Clustering on Big Data. Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence (IJCAI).
- fit(views: List, y=None) → RMKMC¶
Fit the multi-view k-means model.
- Parameters:
views (list of array-like of shape (n_samples, n_features_v)) – List of view data arrays.
y (None) – Ignored.
- Returns:
self – Fitted estimator.
- Return type:
object
- fit_predict(views: List, y=None) → numpy.ndarray¶
Fit the model and return cluster labels.
- Parameters:
views (list of array-like) – List of view data arrays.
y (None) – Ignored.
- Returns:
labels – Cluster labels for each sample.
- Return type:
ndarray
- predict(views: List) → numpy.ndarray¶
Assign cluster labels to new samples (nearest centroid per view, weighted by learned alpha).
- Parameters:
views (list of array-like) – List of view data arrays for prediction.
- Returns:
labels – Predicted cluster labels.
- Return type:
ndarray of shape (n_samples,)
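The assignment rule described for predict can be sketched in plain Python. This version assumes the per-view distance is the squared Euclidean distance and that the view costs are combined as a sum weighted by alpha; both choices, and the function name, are assumptions made for illustration rather than a description of the library's internals.

```python
def predict_labels(views, centroids, alpha):
    """Assign each sample to the cluster with the lowest weighted cost.

    views:     list of per-view sample matrices (lists of rows).
    centroids: list of per-view centroid matrices, one row per cluster.
    alpha:     one weight per view.
    """
    n_samples = len(views[0])
    n_clusters = len(centroids[0])
    labels = []
    for i in range(n_samples):
        costs = [0.0] * n_clusters
        for X, F, weight in zip(views, centroids, alpha):
            for k in range(n_clusters):
                # Squared Euclidean distance from sample i to centroid k in this view.
                dist2 = sum((x - f) ** 2 for x, f in zip(X[i], F[k]))
                costs[k] += weight * dist2
        # Pick the cluster with the smallest combined cost.
        labels.append(min(range(n_clusters), key=costs.__getitem__))
    return labels

# One sample whose two views disagree about the nearest centroid.
views = [[[0.0]], [[3.0]]]
centroids = [[[0.0], [2.0]], [[0.0], [3.0]]]
print(predict_labels(views, centroids, alpha=[0.9, 0.1]))
print(predict_labels(views, centroids, alpha=[0.1, 0.9]))
```

When the views disagree, the view with the larger weight decides the label, which is the practical effect of learning alpha.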