GCCA

Generalised Canonical Correlation Analysis (GCCA) is a multiview extension of CCA that finds a single shared embedding across all views. It does this by maximising the agreement between each view's linear projection and a common representation in the shared space, while applying ridge regularisation to prevent overfitting.


Finds a shared low-dimensional embedding G that maximises linear agreement across all views simultaneously (MAXVAR criterion).

Works with M >= 2 views. When M = 2 this recovers classical CCA.
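The MAXVAR construction can be sketched in a few lines of NumPy: aggregate each view's ridge-regularised smoother matrix, take the top-k eigenvectors as the shared embedding G, and recover per-view weights by least squares onto G. The helper below is a hypothetical illustration of the criterion, not the polyview implementation.

```python
import numpy as np

def gcca_maxvar(views, k=2, reg=1e-4):
    """Sketch of MAXVAR GCCA (hypothetical helper, not the polyview API).

    views : list of centred (n, d_v) arrays.
    Returns the shared embedding G (n, k) and per-view weights W_v (d_v, k).
    """
    n = views[0].shape[0]
    # Aggregate the per-view smoother matrices X_v (X_v^T X_v + reg I)^{-1} X_v^T.
    M = np.zeros((n, n))
    invs = []
    for X in views:
        d = X.shape[1]
        inv = np.linalg.inv(X.T @ X + reg * np.eye(d))
        invs.append(inv)
        M += X @ inv @ X.T
    # Shared embedding: top-k eigenvectors of the aggregated matrix.
    eigvals, eigvecs = np.linalg.eigh(M)
    order = np.argsort(eigvals)[::-1][:k]
    G = eigvecs[:, order]
    # Per-view weights: least-squares projection of each view onto G.
    W = [inv @ X.T @ G for X, inv in zip(views, invs)]
    return G, W

rng = np.random.default_rng(0)
views = [rng.normal(size=(50, d)) for d in (8, 12, 10)]
views = [X - X.mean(axis=0) for X in views]
G, W = gcca_maxvar(views, k=2)
print(G.shape)     # (50, 2)
print(W[1].shape)  # (12, 2)
```

Because G comes from an orthonormal eigenbasis, the shared coordinates are uncorrelated by construction.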

Parameters

n_components : int, default=2
    Number of shared dimensions k.

regularisation : float or list of float, default=1e-4
    Ridge regularisation added to each view's covariance before inversion. A single float applies the same value to all views; a list gives per-view values. Larger values mean stronger regularisation (useful when d_v > n or features are collinear).
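The two accepted forms and the ridge term can be illustrated with a small sketch; `per_view_regs` and `regularised_inverse` are hypothetical helpers named here for illustration, assuming the semantics described above.

```python
import numpy as np

def per_view_regs(reg, n_views):
    # Hypothetical normalisation: a float applies to every view,
    # a list must supply one value per view.
    if np.isscalar(reg):
        return [float(reg)] * n_views
    if len(reg) != n_views:
        raise ValueError("need one regularisation value per view")
    return [float(r) for r in reg]

def regularised_inverse(X, reg):
    # Ridge term added to the view covariance before inversion,
    # as the parameter description specifies.
    d = X.shape[1]
    return np.linalg.inv(X.T @ X + reg * np.eye(d))

print(per_view_regs(1e-4, 3))  # [0.0001, 0.0001, 0.0001]

# With d_v > n the raw covariance is rank-deficient; the ridge makes it invertible.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 10))
C_inv = regularised_inverse(X, 1e-2)
print(np.allclose((X.T @ X + 1e-2 * np.eye(10)) @ C_inv, np.eye(10)))  # True
```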

output : {"concat", "mean", "list"}, default="concat"
    How to combine the per-view projections in transform():

    - "concat" : [Z1 | Z2 | ... | ZM], shape (n, M*k)
    - "mean"   : (Z1 + Z2 + ... + ZM) / M, shape (n, k)
    - "list"   : [Z1, Z2, ..., ZM], a list of (n, k) arrays
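The three output modes reduce to simple NumPy operations on the per-view projections; the snippet below assumes three projections Z_v of shape (n, k), made up for illustration.

```python
import numpy as np

# Hypothetical per-view projections Z_v of shape (n, k).
n, k = 6, 2
rng = np.random.default_rng(1)
Zs = [rng.normal(size=(n, k)) for _ in range(3)]

concat = np.hstack(Zs)    # "concat": shape (n, M*k) = (6, 6)
mean = sum(Zs) / len(Zs)  # "mean":   shape (n, k)   = (6, 2)
as_list = Zs              # "list":   three (6, 2) arrays

print(concat.shape, mean.shape, len(as_list))  # (6, 6) (6, 2) 3
```

"concat" preserves view-specific structure at the cost of dimensionality; "mean" gives a single consensus embedding.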

centre : bool, default=True
    Subtract column means from each view before fitting.

Attributes

G_ : ndarray of shape (n_train, n_components)
    Shared embedding of the training data.

weights_ : list of ndarray, each of shape (n_features_v, n_components)
    Per-view projection matrices W(v).

means_ : list of ndarray, each of shape (n_features_v,)
    Per-view column means (used to centre test data).
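How the fitted means and weights combine at transform time can be sketched for a single view: centre the test data with the training means, then project with that view's weight matrix. The arrays below stand in for one entry of the fitted attributes and are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
X_train = rng.normal(size=(40, 7))
mean_v = X_train.mean(axis=0)        # stored per view at fit time (means_)
W_v = rng.normal(size=(7, 3))        # stands in for one entry of weights_

# Transform-time projection for view v: centre with the *training* means,
# then apply the view's projection matrix.
X_test = rng.normal(size=(10, 7))
Z_test = (X_test - mean_v) @ W_v
print(Z_test.shape)  # (10, 3)
```

Centring test data with the training means (rather than its own) keeps train and test projections in the same coordinate frame.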

eigenvalues_ : ndarray of shape (n_components,)
    Top-k eigenvalues of the aggregated smoother matrix.

Examples

>>> import numpy as np
>>> from polyview.embed.cca import GCCA
>>> rng = np.random.default_rng(0)
>>> X1, X2, X3 = (rng.normal(size=(100, d)) for d in (20, 30, 25))
>>> T1, T2, T3 = (rng.normal(size=(20, d)) for d in (20, 30, 25))
>>> gcca = GCCA(n_components=10, output="concat")
>>> Z_train = gcca.fit_transform([X1, X2, X3])  # shape (100, 3 * 10)
>>> Z_test = gcca.transform([T1, T2, T3])       # shape (20, 3 * 10)