8.1.2.7. sklearn.cluster.spectral_clustering
sklearn.cluster.spectral_clustering(affinity, k=8, n_components=None, mode=None, random_state=None, n_init=10)
Apply k-means to a projection of the normalized Laplacian.
In practice, spectral clustering is very useful when the structure of the individual clusters is highly non-convex or, more generally, when a measure of the center and spread of a cluster is not a suitable description of the complete cluster, for instance when the clusters are nested circles on the 2D plane.
If affinity is the adjacency matrix of a graph, this method can be used to find normalized graph cuts.
Parameters : affinity: array-like or sparse matrix, shape: (n_samples, n_samples) :
The affinity matrix describing the relationship of the samples to embed. Must be symmetric.
Possible examples:
- adjacency matrix of a graph,
- heat kernel of the pairwise distance matrix of the samples,
- symmetric k-nearest neighbours connectivity matrix of the samples.
k: integer, optional :
Number of clusters to extract.
n_components: integer, optional, default is k :
Number of eigenvectors to use for the spectral embedding.
mode: {None, ‘arpack’ or ‘amg’} :
The eigenvalue decomposition strategy to use. AMG requires pyamg to be installed. It can be faster on very large, sparse problems, but may also lead to instabilities.
random_state: int seed, RandomState instance, or None (default) :
A pseudo-random number generator used to initialize the lobpcg eigenvector decomposition when mode == ‘amg’, and to initialize k-means.
n_init: int, optional, default: 10 :
Number of times the k-means algorithm will be run with different centroid seeds. The final result will be the best output of n_init consecutive runs in terms of inertia.
Returns : labels: array of integers, shape: n_samples :
The labels of the clusters.
centers: array of integers, shape: k :
The indices of the cluster centers.
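As an illustration of the affinity examples listed under Parameters, the following is a minimal pure-Python sketch of two ways to build a symmetric affinity matrix from sample coordinates. The helper names, the bandwidth `sigma`, and the exact kernel form `exp(-d^2 / (2 * sigma^2))` are assumptions made for this example; they are not part of the scikit-learn API.

```python
import math


def pairwise_distances(points):
    """Euclidean distance between every pair of points (hypothetical helper)."""
    return [[math.dist(p, q) for q in points] for p in points]


def heat_kernel_affinity(distances, sigma=1.0):
    """Heat kernel exp(-d^2 / (2 sigma^2)); sigma is an assumed bandwidth."""
    return [[math.exp(-d * d / (2.0 * sigma ** 2)) for d in row]
            for row in distances]


def symmetric_knn_affinity(distances, n_neighbors=2):
    """Connect i and j if either one is among the other's nearest neighbours."""
    n = len(distances)
    connectivity = [[0.0] * n for _ in range(n)]
    for i, row in enumerate(distances):
        # Candidate neighbours of i, nearest first, excluding i itself.
        order = sorted((j for j in range(n) if j != i), key=lambda j: row[j])
        for j in order[:n_neighbors]:
            connectivity[i][j] = 1.0
            connectivity[j][i] = 1.0  # symmetrize the k-NN relation
    return connectivity


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0)]
distances = pairwise_distances(points)
heat = heat_kernel_affinity(distances, sigma=1.0)
knn = symmetric_knn_affinity(distances, n_neighbors=1)
```

Both matrices are symmetric by construction, as the `affinity` parameter requires. Note that with a small `n_neighbors` the k-nearest-neighbours graph can split into several connected components, as it does for the two groups of points here.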
Notes
The graph should contain only one connected component; otherwise the results make little sense.
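One way to check the connectivity requirement before clustering is to count the connected components of the affinity graph. The sketch below uses a plain breadth-first search over a dense matrix and treats any non-zero entry as an edge; the function name `count_connected_components` is hypothetical and chosen only for this example.

```python
from collections import deque


def count_connected_components(affinity):
    """Count connected components, treating any non-zero entry as an edge."""
    n = len(affinity)
    seen = [False] * n
    n_components = 0
    for start in range(n):
        if seen[start]:
            continue
        # Every unseen node starts a new component; explore it fully.
        n_components += 1
        seen[start] = True
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in range(n):
                if affinity[i][j] != 0 and not seen[j]:
                    seen[j] = True
                    queue.append(j)
    return n_components


# A path graph 0 - 1 - 2: one component.
connected = [[0, 1, 0],
             [1, 0, 1],
             [0, 1, 0]]

# Two disjoint edges 0 - 1 and 2 - 3: two components.
split = [[0, 1, 0, 0],
         [1, 0, 0, 0],
         [0, 0, 0, 1],
         [0, 0, 1, 0]]

n_connected = count_connected_components(connected)
n_split = count_connected_components(split)
```

For large sparse affinity matrices, `scipy.sparse.csgraph.connected_components` is likely a better fit than this dense loop.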
This algorithm solves the normalized cut for k=2: it is a normalized spectral clustering.
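To make the k=2 case concrete, here is a minimal NumPy sketch of a two-way spectral partition in the spirit of Shi and Malik. It is not the library's implementation: it uses a dense eigendecomposition instead of arpack or amg, and it splits on the sign of the second eigenvector rather than running k-means, which is a simplifying assumption. The function name `two_way_normalized_cut` is hypothetical.

```python
import numpy as np


def two_way_normalized_cut(affinity):
    """Split a connected graph in two using the normalized Laplacian.

    Assumes a symmetric, non-negative affinity matrix in which every
    node has a strictly positive degree.
    """
    affinity = np.asarray(affinity, dtype=float)
    degree = affinity.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}.
    normalized = d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]
    laplacian = np.eye(len(affinity)) - normalized
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigenvectors = np.linalg.eigh(laplacian)
    # Rescale the second eigenvector by D^{-1/2} to obtain the solution
    # of the generalized problem (D - A) y = lambda D y.
    indicator = d_inv_sqrt * eigenvectors[:, 1]
    # Thresholding at zero replaces the k-means step in this sketch.
    return (indicator > 0).astype(int)


# Two triangles joined by a single weak edge between nodes 2 and 3.
affinity = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0.1, 0, 0],
    [0, 0, 0.1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
])
labels = two_way_normalized_cut(affinity)
```

The two triangles end up with different labels. Which triangle receives label 0 depends on the arbitrary sign of the eigenvector, so only the grouping is meaningful.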
References
- Normalized cuts and image segmentation, 2000 Jianbo Shi, Jitendra Malik http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.2324
- A Tutorial on Spectral Clustering, 2007 Ulrike von Luxburg http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.165.9323