This module provides code to bootstrap
Ward’s hierarchical clustering.
This is useful for statistically validating clustering results.
For the theory, see:
Basic demo of the ward_msb procedure.
In this case, the dominant split into 2 clusters should have
the dominant p-value.
INPUT:
- n, d: the dimensions of the dataset
- niter: the number of bootstraps
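
A minimal sketch of the kind of dataset such a demo could use, assuming two well-separated groups of items. The helper make_two_cluster_data and its shift and seed parameters are hypothetical, and SciPy's linkage and fcluster stand in for the module's own tree code, which is not shown here.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def make_two_cluster_data(n=10, d=5, shift=6.0, seed=0):
    """Hypothetical helper: n items in d dimensions, with the first half shifted."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))
    X[: n // 2] += shift  # offset the first half to create two groups
    return X


X = make_two_cluster_data()
Z = linkage(X, method="ward")  # Ward's hierarchical clustering (SciPy)
labels = fcluster(Z, t=2, criterion="maxclust")  # cut the tree into 2 clusters
```

With groups this far apart, the 2-cluster cut is expected to recover the two halves of the dataset, which is the split the demo singles out.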
Multi-scale bootstrap procedure.
INPUT:
- X: array of shape (n, p), where
n is the number of items to be clustered
p is their dimension
- niter=1000
number of iterations of the bootstrap
OUTPUT:
- t: the resulting tree clustering
the associated subtrees are given by t.list_of_subtrees();
there are precisely n such subtrees
- cpval: array of shape (n): the corrected p-values of the clusters
- upval: array of shape (n): the uncorrected p-values of the clusters
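
To illustrate what an uncorrected bootstrap value measures, here is a simplified single-scale sketch. It is not the ward_msb implementation. It uses SciPy rather than the module's tree class, looks only at the top split into 2 clusters instead of all n subtrees, and assumes that the p dimensions (columns) are resampled with replacement while the n items stay fixed. The names top_split and bootstrap_top_split_pval are hypothetical. The multi-scale correction that produces cpval would additionally vary the resample size, and is not shown.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def top_split(X):
    """Return the 2-cluster partition of Ward's tree as a set of index sets."""
    labels = fcluster(linkage(X, method="ward"), t=2, criterion="maxclust")
    return frozenset(
        frozenset(np.flatnonzero(labels == k).tolist()) for k in np.unique(labels)
    )


def bootstrap_top_split_pval(X, niter=1000, seed=0):
    """Fraction of bootstrap replicates that reproduce the original top split."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    reference = top_split(X)
    hits = 0
    for _ in range(niter):
        # Assumption: resample the dimensions (columns) with replacement.
        cols = rng.integers(0, p, size=p)
        hits += top_split(X[:, cols]) == reference
    return hits / niter


# Usage on synthetic data with two well-separated groups of items.
rng = np.random.default_rng(1)
X = rng.standard_normal((10, 5))
X[:5] += 6.0
pval = bootstrap_top_split_pval(X, niter=200)
```

Comparing partitions as sets of index sets makes the check independent of how the two clusters happen to be labelled in each replicate.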