Multi-View K-Means

The MultiViewKMeans class implements a multi-view extension of the K-Means clustering algorithm. It clusters samples that are described by several views (feature sets) of the same data and produces a single consensus assignment that draws on the information in every view.

class polyview.cluster.mv_kmeans.MultiViewKMeans(*args: Any, **kwargs: Any)

Bases: BaseMultiViewClusterer

Multi-view K-Means clustering.

Parameters:
  • n_clusters (int, default=2) – Number of clusters.

  • gamma (float, default=2.0) – Controls the distribution of view weights alpha(v).

  • max_iter (int, default=50) – Maximum number of alternating-update iterations.

  • n_init (int, default=10) – Number of random restarts. The run with the lowest objective is kept.

  • tol (float, default=1e-6) – Convergence tolerance. Iteration stops when the relative change in the objective falls below this value.

  • learn_weights (bool, default=True) – If True, run RMKMC (adaptive view weights). If False, run SMKMC (equal view weights, simpler).

  • eps (float, default=1e-10) – Small constant added to row norms before inversion (D update) to avoid division by zero on zero-residual samples.

  • random_state (int or None, default=None) – Random seed.
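The role of eps can be illustrated with a small sketch. It assumes the reweighting step described by Cai et al., in which each sample's diagonal weight in D is the inverse of twice the norm of its residual row. The helper name update_sample_weights and the exact placement of eps are assumptions made for illustration, not taken from the library source.

```python
import numpy as np

def update_sample_weights(X, G, F, eps=1e-10):
    """Hypothetical helper: diagonal of D for a single view.

    X: (n_samples, n_features) view data
    G: (n_samples, n_clusters) one-hot cluster indicator
    F: (n_clusters, n_features) centroids for this view
    """
    residual = X - G @ F                      # each row: sample minus its assigned centroid
    norms = np.linalg.norm(residual, axis=1)  # row-wise L2 norms
    # eps keeps the inversion finite when a sample sits exactly on its centroid
    return 1.0 / (2.0 * (norms + eps))

# Toy data: the first sample coincides with its centroid (zero residual).
X = np.array([[0.0, 0.0], [3.0, 4.0]])
G = np.array([[1.0, 0.0], [1.0, 0.0]])
F = np.array([[0.0, 0.0], [1.0, 1.0]])
d = update_sample_weights(X, G, F)
```

Without eps the first entry would be a division by zero; with it, the entry is very large but finite, so a zero-residual sample does not break the update.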

labels_

Consensus cluster assignment.

Type:

ndarray of shape (n_samples,)

centroids_

Per-view cluster centroid matrices F(v).

Type:

list of ndarray, shape (n_clusters, n_features_v)

weights_

Learned view importance weights alpha(v). All equal to 1/n_views when learn_weights=False.

Type:

ndarray of shape (n_views,)

objective_

Final value of the objective function.

Type:

float

n_iter_

Number of iterations performed in the best run.

Type:

int

Notes

The gamma parameter controls how the view weights alpha(v) are distributed:
  • gamma -> inf gives equal weights.

  • gamma -> 1 collapses weight onto the single best view.

The paper recommends searching log10(gamma) in [0.1, 2.0].
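The two limits above can be checked numerically with a minimal sketch. It assumes the closed-form weight update from Cai et al., where alpha(v) is proportional to (gamma * H(v)) ** (1 / (1 - gamma)) and H(v) is the objective term of view v. The helper view_weights and the values in H are made up for illustration and are not part of the library.

```python
import numpy as np

def view_weights(H, gamma):
    """Hypothetical helper: normalised view weights for gamma > 1 and H > 0."""
    H = np.asarray(H, dtype=float)
    log_w = np.log(gamma * H) / (1.0 - gamma)  # log of the unnormalised weights
    log_w -= log_w.max()                       # stabilise before exponentiating
    w = np.exp(log_w)
    return w / w.sum()

H = [1.0, 2.0, 4.0]                  # made-up per-view objective terms (lower is better)
near_one = view_weights(H, 1.01)     # gamma close to 1: weight collapses onto view 0
large = view_weights(H, 1e6)         # very large gamma: weights become almost equal

# A search grid matching log10(gamma) in [0.1, 2.0]
gamma_grid = np.logspace(0.1, 2.0, num=5)
```

The grid runs from roughly 1.26 to 100; in practice each value would be passed as gamma to a separate fit and the results compared.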

Setting learn_weights=False recovers the Simple Multi-view K-Means (SMKMC) variant from the same paper, which uses equal view weights throughout.

Examples

>>> import numpy as np
>>> from polyview.cluster.mv_kmeans import MultiViewKMeans
>>> X1 = np.random.rand(100, 4)
>>> X2 = np.random.rand(100, 6)
>>> model = MultiViewKMeans(n_clusters=3, random_state=0)
>>> labels = model.fit_predict([X1, X2])
>>> labels.shape
(100,)

Reference

Cai, X., Nie, F., and Huang, H. (2013). Multi-View K-Means Clustering on Big Data. Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI).

fit(views: List, y=None) → RMKMC

Fit the multi-view k-means model.

Parameters:
  • views (list of array-like of shape (n_samples, n_features_v)) – List of view data arrays.

  • y (None) – Ignored.

Returns:

self – Fitted estimator.

Return type:

object
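Because every view must describe the same samples, it can help to validate the list before calling fit. The following is a minimal sketch of such a pre-check; check_views is a hypothetical helper written for this example, and whether the library performs an equivalent check internally is not stated here.

```python
import numpy as np

def check_views(views):
    """Hypothetical pre-check: every view must be 2-D with the same n_samples."""
    arrays = [np.asarray(v, dtype=float) for v in views]
    if not arrays:
        raise ValueError("at least one view is required")
    for i, a in enumerate(arrays):
        if a.ndim != 2:
            raise ValueError(f"view {i} must be 2-D, got ndim={a.ndim}")
    n_samples = arrays[0].shape[0]
    for i, a in enumerate(arrays):
        if a.shape[0] != n_samples:
            raise ValueError(
                f"view {i} has {a.shape[0]} samples, expected {n_samples}"
            )
    return arrays

# Two views of the same 100 samples with different feature counts pass the check.
rng = np.random.default_rng(0)
views = check_views([rng.random((100, 4)), rng.random((100, 6))])
```

The number of features may differ between views; only the number of rows has to agree.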

fit_predict(views: List, y=None) → numpy.ndarray

Fit the model and return cluster labels.

Parameters:
  • views (list of array-like) – List of view data arrays.

  • y (None) – Ignored.

Returns:

labels – Cluster labels for each sample.

Return type:

ndarray

predict(views: List) → numpy.ndarray

Assign cluster labels to new samples. Distances to the nearest centroids are computed in each view and combined across views, weighted by the learned alpha(v).

Parameters:
  • views (list of array-like) – List of view data arrays for prediction.

Returns:

labels – Predicted cluster labels.

Return type:

ndarray of shape (n_samples,)
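The weighted nearest-centroid rule can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: the function predict_labels is hypothetical, and the use of plain Euclidean distance multiplied by alpha(v) (rather than, for example, a squared distance or alpha(v) raised to gamma) is a guess at the combination rule.

```python
import numpy as np

def predict_labels(views, centroids, weights):
    """Hypothetical sketch: weighted nearest-centroid assignment across views.

    views:     list of (n_samples, n_features_v) arrays
    centroids: list of (n_clusters, n_features_v) arrays, one per view
    weights:   (n_views,) view weights
    """
    n_samples = np.asarray(views[0]).shape[0]
    n_clusters = np.asarray(centroids[0]).shape[0]
    total = np.zeros((n_samples, n_clusters))
    for X, F, alpha in zip(views, centroids, weights):
        X = np.asarray(X, dtype=float)
        F = np.asarray(F, dtype=float)
        # Euclidean distance from every sample to every centroid in this view
        dist = np.linalg.norm(X[:, None, :] - F[None, :, :], axis=2)
        total += alpha * dist            # accumulate the weighted per-view distances
    return total.argmin(axis=1)          # cluster with the smallest combined distance

# Toy example with two views and two clusters.
views = [np.array([[0.0, 0.0], [10.0, 10.0]]), np.array([[1.0], [9.0]])]
centroids = [np.array([[0.0, 0.0], [10.0, 10.0]]), np.array([[0.0], [10.0]])]
labels = predict_labels(views, centroids, np.array([0.5, 0.5]))
```

With a fitted model, the centroids_ and weights_ attributes documented above would play the roles of centroids and weights.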