CKMeans Class Reference


Detailed Description

K-means clustering partitions the data into k clusters, where k is specified a priori.

It minimizes

\[ \sum_{i=1}^k\sum_{x_j\in S_i} (x_j-\mu_i)^2 \]

where $\mu_i$ are the cluster centers and $S_i,\;i=1,\dots,k$ are the index sets of the clusters.
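
As a concrete reading of the objective above, here is a minimal sketch that evaluates the sum for a given assignment of points to clusters. The function `kmeans_objective` and its arguments are hypothetical names chosen for illustration; they are not part of CKMeans, and the squared term is assumed to be the squared Euclidean distance.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of CKMeans: evaluates the k-means objective.
// points:  n vectors x_j of length dim
// centers: k vectors mu_i of length dim
// assign:  assign[j] is the index i of the cluster S_i that contains x_j
double kmeans_objective(const std::vector<std::vector<double>>& points,
                        const std::vector<std::vector<double>>& centers,
                        const std::vector<std::size_t>& assign)
{
    assert(points.size() == assign.size());
    double total = 0.0;
    for (std::size_t j = 0; j < points.size(); ++j)
    {
        assert(assign[j] < centers.size());
        const std::vector<double>& mu = centers[assign[j]];
        assert(mu.size() == points[j].size());
        for (std::size_t d = 0; d < mu.size(); ++d)
        {
            const double diff = points[j][d] - mu[d];
            total += diff * diff;  // accumulate squared distance per coordinate
        }
    }
    return total;
}
```

For the points (0,0), (2,0), (10,0) with centers (1,0) and (10,0) and assignment {0, 0, 1}, the sum is 1 + 1 + 0 = 2.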

Beware that this algorithm is only guaranteed to reach a local optimum, so the result depends on the initial cluster centers.
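
The dependence on the starting point can be illustrated with a one-dimensional sketch of the standard Lloyd iteration (assign each point to its nearest center, then move each center to the mean of its points). This is an assumption about the general technique, not a copy of the CKMeans implementation; `lloyd_1d` and its parameters are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical 1-D Lloyd iteration, for illustration only (not CKMeans code).
// x: data points, mu: initial centers, max_iter: iteration cap.
// Returns the centers after convergence or after max_iter iterations.
std::vector<double> lloyd_1d(const std::vector<double>& x,
                             std::vector<double> mu,
                             int max_iter)
{
    assert(!mu.empty());
    std::vector<std::size_t> assign(x.size(), 0);
    for (int it = 0; it < max_iter; ++it)
    {
        bool changed = (it == 0);  // always run the first update step
        // Assignment step: each point joins its nearest center.
        for (std::size_t j = 0; j < x.size(); ++j)
        {
            std::size_t best = 0;
            for (std::size_t i = 1; i < mu.size(); ++i)
                if (std::fabs(x[j] - mu[i]) < std::fabs(x[j] - mu[best]))
                    best = i;
            if (best != assign[j]) { assign[j] = best; changed = true; }
        }
        if (!changed) break;  // assignments are stable: converged
        // Update step: each center moves to the mean of its points.
        for (std::size_t i = 0; i < mu.size(); ++i)
        {
            double sum = 0.0;
            std::size_t count = 0;
            for (std::size_t j = 0; j < x.size(); ++j)
                if (assign[j] == i) { sum += x[j]; ++count; }
            if (count > 0) mu[i] = sum / static_cast<double>(count);
            // an empty cluster keeps its previous center
        }
    }
    return mu;
}
```

On the points {0, 6, 7, 20} with k = 2, starting from centers {0, 11} the iteration stays at {0, 11} (objective 122), while starting from {4, 20} it ends at {13/3, 20} (objective about 28.67). Both are fixed points of the iteration, but only the second is the better clustering.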

cf. http://en.wikipedia.org/wiki/K-means_algorithm

Definition at line 39 of file KMeans.h.

[Figure: inheritance diagram for CKMeans]


Public Member Functions

 CKMeans ()
 CKMeans (int32_t k, CDistance *d)
virtual ~CKMeans ()
virtual EClassifierType get_classifier_type ()
virtual bool train (CFeatures *data=NULL)
virtual bool load (FILE *srcfile)
virtual bool save (FILE *dstfile)
void set_k (int32_t p_k)
int32_t get_k ()
void set_max_iter (int32_t iter)
float64_t get_max_iter ()
void get_radi (float64_t *&radi, int32_t &num)
void get_centers (float64_t *&centers, int32_t &dim, int32_t &num)
void get_radiuses (float64_t **radii, int32_t *num)
void get_cluster_centers (float64_t **centers, int32_t *dim, int32_t *num)
int32_t get_dimensions ()

Protected Member Functions

void sqdist (float64_t *x, CSimpleFeatures< float64_t > *y, float64_t *z, int32_t n1, int32_t offs, int32_t n2, int32_t m)
void clustknb (bool use_old_mus, float64_t *mus_start)
virtual CLabels * classify ()
virtual CLabels * classify (CFeatures *data)
virtual const char * get_name () const

Protected Attributes

int32_t max_iter
 maximum number of iterations
int32_t k
 the k parameter in KMeans
int32_t dimensions
 number of dimensions
float64_t * R
 radii of the clusters (size k)
float64_t * mus
 centers of the clusters (size dimensions x k)

Constructor & Destructor Documentation

CKMeans (  ) 

default constructor

Definition at line 29 of file KMeans.cpp.

CKMeans ( int32_t  k,
CDistance *  d 
)

constructor

Parameters:
k number of clusters k
d distance to use

Definition at line 35 of file KMeans.cpp.

~CKMeans (  )  [virtual]

Definition at line 42 of file KMeans.cpp.


Member Function Documentation

virtual CLabels* classify ( CFeatures *  data  )  [protected, virtual]

classify objects

Parameters:
data (test)data to be classified
Returns:
classified labels

Implements CClassifier.

Definition at line 224 of file KMeans.h.

virtual CLabels* classify (  )  [protected, virtual]

classify objects using the currently set features

Returns:
classified labels

Implements CClassifier.

Definition at line 213 of file KMeans.h.

void clustknb ( bool  use_old_mus,
float64_t *  mus_start 
) [protected]

clustknb

Parameters:
use_old_mus whether the old centers (mus) are used
mus_start starting values for the centers (mus)

replace rhs feature vectors

set rhs to mus_start

update rhs

sqdist(mus, lhs, dists, k, Pat, 1, dimensions);

Definition at line 177 of file KMeans.cpp.

void get_centers ( float64_t *&  centers,
int32_t &  dim,
int32_t &  num 
)

get centers

Parameters:
centers current centers are stored in here
dim dimensions are stored in here
num number of centers is stored in here

Definition at line 138 of file KMeans.h.
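
The centers come back as one flat buffer together with `dim` and `num`. The sketch below shows one way to read a single coordinate from such a buffer. It assumes that each center occupies `dim` consecutive entries (center 0 first, then center 1, and so on); this layout is an assumption based on the stated size "dimensions x k" and should be checked against KMeans.cpp. `center_coord` is a hypothetical helper, not part of CKMeans.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical accessor for a flat buffer of cluster centers.
// ASSUMPTION: center i occupies entries [i*dim, (i+1)*dim) of the buffer.
// centers: flat buffer of dim*num values
// i: index of the center, d: index of the coordinate
double center_coord(const double* centers, std::int32_t dim, std::int32_t num,
                    std::int32_t i, std::int32_t d)
{
    assert(centers != nullptr);
    assert(0 <= i && i < num);
    assert(0 <= d && d < dim);
    return centers[static_cast<std::size_t>(i) * static_cast<std::size_t>(dim)
                   + static_cast<std::size_t>(d)];
}
```

With `dim = 2`, `num = 3` and the buffer {1, 2, 3, 4, 5, 6}, center 1 is (3, 4) and center 2 is (5, 6) under this layout.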

virtual EClassifierType get_classifier_type (  )  [virtual]

get classifier type

Returns:
classifier type KMEANS

Reimplemented from CClassifier.

Definition at line 57 of file KMeans.h.

void get_cluster_centers ( float64_t **  centers,
int32_t *  dim,
int32_t *  num 
)

get cluster centers (SWIG compatible)

Parameters:
centers current cluster centers are stored in here
dim dimensions are stored in here
num number of centers is stored in here

Definition at line 166 of file KMeans.h.

int32_t get_dimensions (  ) 

get dimensions

Returns:
number of dimensions

Definition at line 182 of file KMeans.h.

int32_t get_k (  ) 

get k

Returns:
the parameter k

Definition at line 97 of file KMeans.h.

float64_t get_max_iter (  ) 

get maximum number of iterations

Returns:
maximum number of iterations

Definition at line 116 of file KMeans.h.

virtual const char* get_name (  )  const [protected, virtual]
Returns:
object name

Implements CSGObject.

Definition at line 233 of file KMeans.h.

void get_radi ( float64_t *&  radi,
int32_t &  num 
)

get radii

Parameters:
radi current radii are stored in here
num number of radii is stored in here

Definition at line 126 of file KMeans.h.

void get_radiuses ( float64_t **  radii,
int32_t *  num 
)

get radii (SWIG compatible)

Parameters:
radii current radii are stored in here
num number of radii is stored in here

Definition at line 150 of file KMeans.h.

bool load ( FILE *  srcfile  )  [virtual]

load distance machine from file

Parameters:
srcfile file to load from
Returns:
whether loading was successful

Reimplemented from CClassifier.

Definition at line 72 of file KMeans.cpp.

bool save ( FILE *  dstfile  )  [virtual]

save distance machine to file

Parameters:
dstfile file to save to
Returns:
whether saving was successful

Reimplemented from CClassifier.

Definition at line 77 of file KMeans.cpp.

void set_k ( int32_t  p_k  ) 

set k

Parameters:
p_k new k

Definition at line 87 of file KMeans.h.

void set_max_iter ( int32_t  iter  ) 

set maximum number of iterations

Parameters:
iter the new maximum

Definition at line 106 of file KMeans.h.

void sqdist ( float64_t *  x,
CSimpleFeatures< float64_t > *  y,
float64_t *  z,
int32_t  n1,
int32_t  offs,
int32_t  n2,
int32_t  m 
) [protected]

sqdist

Parameters:
x x
y y
z z
n1 n1
offs offset
n2 n2
m m

Definition at line 129 of file KMeans.cpp.
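
The parameters of sqdist are not described above, so the following is only a sketch of what the name suggests: squared Euclidean distances between a set of centers and a set of points. The function `squared_distances`, its signature, and the buffer layout are assumptions made for illustration and do not reproduce the signature or body of CKMeans::sqdist.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in, NOT CKMeans::sqdist.
// centers: k vectors of length dim, stored one after another
// points:  n vectors of length dim, stored one after another
// Returns a k*n buffer where out[i*n + j] is the squared Euclidean
// distance between center i and point j.
std::vector<double> squared_distances(const std::vector<double>& centers,
                                      const std::vector<double>& points,
                                      std::size_t dim)
{
    assert(dim > 0);
    assert(centers.size() % dim == 0);
    assert(points.size() % dim == 0);
    const std::size_t k = centers.size() / dim;
    const std::size_t n = points.size() / dim;
    std::vector<double> out(k * n, 0.0);
    for (std::size_t i = 0; i < k; ++i)
        for (std::size_t j = 0; j < n; ++j)
            for (std::size_t d = 0; d < dim; ++d)
            {
                const double diff = centers[i * dim + d] - points[j * dim + d];
                out[i * n + j] += diff * diff;
            }
    return out;
}
```

For centers (0,0) and (3,4) and points (0,0) and (3,0), the result is {0, 9, 25, 16}.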

bool train ( CFeatures *  data = NULL  )  [virtual]

train k-means

Parameters:
data training data (can be omitted if a distance- or kernel-based classifier is used and its distance/kernel has already been initialized with the training data)
Returns:
whether training was successful

Reimplemented from CClassifier.

Definition at line 48 of file KMeans.cpp.


Member Data Documentation

int32_t dimensions [protected]

number of dimensions

Definition at line 243 of file KMeans.h.

int32_t k [protected]

the k parameter in KMeans

Definition at line 240 of file KMeans.h.

int32_t max_iter [protected]

maximum number of iterations

Definition at line 237 of file KMeans.h.

float64_t* mus [protected]

centers of the clusters (size dimensions x k)

Definition at line 249 of file KMeans.h.

float64_t* R [protected]

radi of the clusters (size k)

Definition at line 246 of file KMeans.h.


The documentation for this class was generated from the following files:

KMeans.h
KMeans.cpp

SHOGUN Machine Learning Toolbox - Documentation