Multi-View K-Means

The MultiViewKMeans class extends K-Means to data described by several views (feature sets) of the same samples. It learns one centroid matrix per view and a single consensus cluster assignment shared by all views, and can optionally learn a weight for each view.

class polyview.cluster.mv_kmeans.MultiViewKMeans(*args: Any, **kwargs: Any)

Bases: BaseMultiViewClusterer

Multi-view K-Means clustering.

Parameters:

n_clusters (int, default=2) – Number of clusters.
gamma (float, default=2.0) – Controls the distribution of view weights alpha(v).
max_iter (int, default=50) – Maximum number of alternating-update iterations.
n_init (int, default=10) – Number of random restarts. The run with the lowest objective is kept.
tol (float, default=1e-6) – Stop when the relative change in objective falls below this.
learn_weights (bool, default=True) – If True, run RMKMC (adaptive view weights). If False, run SMKMC (equal view weights, simpler).
eps (float, default=1e-10) – Small constant added to row norms before inversion (D update) to avoid division by zero on zero-residual samples.
random_state (int or None, default=None) – Random seed.
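
The role of eps can be illustrated with a short NumPy sketch. It assumes the D update is the usual reweighting for an L2,1-norm objective, where each sample's weight is the inverse of twice its residual row norm. The helper name row_weights and the factor of 2 are assumptions made for illustration and are not taken from the polyview source.

```python
import numpy as np

def row_weights(residual, eps=1e-10):
    """Per-sample weights d_i = 1 / (2 * (||e_i|| + eps)) for one view.

    `residual` has shape (n_samples, n_features_v); row i is the difference
    between sample i and its assigned centroid (hypothetical helper).
    """
    norms = np.linalg.norm(residual, axis=1)  # row-wise Euclidean norms
    return 1.0 / (2.0 * (norms + eps))        # eps keeps zero rows finite

E = np.array([[3.0, 4.0],   # row norm 5
              [0.0, 0.0]])  # zero residual: sample sits on its centroid
d = row_weights(E)
```

Without eps the second row would divide by zero. With it, the weight is very large but finite, so a sample that coincides with its centroid does not break the update.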

labels_

Consensus cluster assignment.

Type: ndarray of shape (n_samples,)

centroids_

Per-view cluster centroid matrices F(v).

Type: list of ndarray, each of shape (n_clusters, n_features_v)

weights_

Learned view importance weights alpha(v). All equal to 1/n_views when learn_weights=False.

Type: ndarray of shape (n_views,)

objective_

Final value of the objective function.

Type: float

n_iter_

Number of iterations performed in the best run.

Type: int

Notes

The gamma parameter controls how the view weights alpha(v) are distributed:

gamma -> inf gives equal weights.
gamma -> 1 collapses weight onto the single best view.

The paper recommends searching log10(gamma) in [0.1, 2.0], i.e. gamma between roughly 1.26 and 100.
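
The two limits described above can be checked numerically. The sketch below assumes the closed-form weight update in which alpha(v) is proportional to the per-view loss raised to the power 1 / (1 - gamma), then normalised to sum to 1. The function name view_weights and the example loss values are hypothetical, and the actual polyview implementation may differ.

```python
import numpy as np

def view_weights(view_losses, gamma):
    """Weights alpha(v) proportional to loss_v ** (1 / (1 - gamma)).

    Assumed form (requires gamma > 1 and positive losses); computed in log
    space so that gamma close to 1 does not overflow.
    """
    losses = np.asarray(view_losses, dtype=float)
    logits = np.log(losses) / (1.0 - gamma)
    logits -= logits.max()              # stabilise the exponentials
    w = np.exp(logits)
    return w / w.sum()                  # weights sum to 1

losses = [1.0, 2.0, 4.0]                      # view 0 fits best (lowest loss)
near_one = view_weights(losses, gamma=1.01)   # almost all weight on view 0
moderate = view_weights(losses, gamma=2.0)    # proportional to 1 / loss
large = view_weights(losses, gamma=1e6)       # close to uniform
```

Under this assumption, gamma=2.0 (the default) weights each view by the inverse of its loss, which sits between the two extremes.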

Setting learn_weights=False recovers the Simple Multi-view K-Means (SMKMC) variant from the same paper, which uses equal view weights throughout.

Examples

>>> import numpy as np
>>> from polyview.cluster.mv_kmeans import MultiViewKMeans
>>> X1 = np.random.rand(100, 4)
>>> X2 = np.random.rand(100, 6)
>>> model = MultiViewKMeans(n_clusters=3, random_state=0)
>>> labels = model.fit_predict([X1, X2])
>>> labels.shape
(100,)

References

Cai, X., Nie, F., & Huang, H. (2013). Multi-view K-means clustering on big data. In Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence (IJCAI).

fit(views: List, y=None) → RMKMC

Fit the multi-view k-means model.

Parameters:

views (list of array-like of shape (n_samples, n_features_v)) – List of view data arrays.
y (None) – Ignored.

Returns:

self – Fitted estimator.

Return type:

object

fit_predict(views: List, y=None) → numpy.ndarray

Fit the model and return cluster labels.

Parameters:

views (list of array-like) – List of view data arrays.
y (None) – Ignored.

Returns:

labels – Cluster labels for each sample.

Return type:

ndarray

predict(views: List) → numpy.ndarray

Assign cluster labels to new samples (nearest centroid per view, weighted by learned alpha).

Parameters:

views (list of array-like) – List of view data arrays for prediction.

Returns:

labels – Predicted cluster labels.

Return type:

ndarray of shape (n_samples,)
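
One way to read the weighted nearest-centroid rule used by predict is sketched below. It assumes each sample is assigned to the cluster that minimises the alpha-weighted sum of its per-view Euclidean distances to that cluster's centroids. The function weighted_nearest_centroid and the toy arrays are hypothetical. The library may combine views differently, for example with squared distances or with alpha raised to the power gamma.

```python
import numpy as np

def weighted_nearest_centroid(views, centroids, weights):
    """Label each sample by the smallest weighted sum of per-view distances.

    views:     list of arrays, each (n_samples, n_features_v)
    centroids: list of arrays, each (n_clusters, n_features_v)
    weights:   array of shape (n_views,)
    """
    total = 0.0
    for X, F, w in zip(views, centroids, weights):
        # Euclidean distance from every sample to every centroid in this view
        dist = np.linalg.norm(X[:, None, :] - F[None, :, :], axis=2)
        total = total + w * dist
    return np.argmin(total, axis=1)

# Two views, two clusters; the values are made up for illustration.
F1 = np.array([[0.0, 0.0], [10.0, 10.0]])
F2 = np.array([[0.0], [5.0]])
X1 = np.array([[1.0, 0.0], [9.0, 9.0]])
X2 = np.array([[0.5], [4.0]])
labels = weighted_nearest_centroid([X1, X2], [F1, F2], np.array([0.5, 0.5]))
```

With a fitted model, the arguments would correspond to the new view arrays, centroids_, and weights_.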