SHOGUN  v1.1.0
COligoStringKernel Class Reference

Detailed Description

This class offers access to the Oligo Kernel introduced by Meinicke et al. in 2004.

The class provides functions to preprocess the data so that the kernel can be computed faster. The kernel function is then kernelOligoFast or kernelOligo.

Note: the implementation still needs a significant speedup. It should work, but as it stands it is probably practical only for small-scale academic problems.

Uses CSqrtDiagKernelNormalizer, as the vanilla kernel seems to be very diagonally dominant.
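
To illustrate why a normalizer helps with a diagonally dominant kernel, here is a minimal sketch of square-root diagonal normalization, assuming it divides each entry by the square root of the product of the two corresponding diagonal entries. The helper name sqrt_diag_normalize is hypothetical and not part of SHOGUN; it works on a plain precomputed matrix and assumes a strictly positive diagonal.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration; not the SHOGUN implementation.
// Computes K'(i,j) = K(i,j) / sqrt(K(i,i) * K(j,j)).
// Assumes K is square and every diagonal entry is strictly positive.
std::vector<std::vector<double>> sqrt_diag_normalize(
    const std::vector<std::vector<double>>& K)
{
    const std::size_t n = K.size();
    std::vector<std::vector<double>> out(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            out[i][j] = K[i][j] / std::sqrt(K[i][i] * K[j][j]);
    return out;
}
```

After this step every diagonal entry equals 1, so large self-similarity values no longer dwarf the off-diagonal entries.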

Definition at line 41 of file OligoStringKernel.h.

Inheritance diagram for COligoStringKernel: [inheritance graph not shown]

Public Member Functions

 COligoStringKernel ()
 COligoStringKernel (int32_t cache_size, int32_t k, float64_t width)
virtual ~COligoStringKernel ()
virtual bool init (CFeatures *l, CFeatures *r)
virtual EKernelType get_kernel_type ()
virtual const char * get_name () const
virtual float64_t compute (int32_t x, int32_t y)
virtual void cleanup ()
- Public Member Functions inherited from CStringKernel< char >
 CStringKernel (int32_t cachesize=0)
 CStringKernel (CFeatures *l, CFeatures *r)
virtual EFeatureClass get_feature_class ()
virtual EFeatureType get_feature_type ()
- Public Member Functions inherited from CKernel
 CKernel ()
 CKernel (int32_t size)
 CKernel (CFeatures *l, CFeatures *r, int32_t size)
virtual ~CKernel ()
float64_t kernel (int32_t idx_a, int32_t idx_b)
SGMatrix< float64_t > get_kernel_matrix ()
virtual SGVector< float64_t > get_kernel_col (int32_t j)
virtual SGVector< float64_t > get_kernel_row (int32_t i)
template<class T >
SGMatrix< T > get_kernel_matrix ()
virtual bool set_normalizer (CKernelNormalizer *normalizer)
virtual CKernelNormalizer * get_normalizer ()
virtual bool init_normalizer ()
void load (CFile *loader)
void save (CFile *writer)
CFeatures * get_lhs ()
CFeatures * get_rhs ()
virtual int32_t get_num_vec_lhs ()
virtual int32_t get_num_vec_rhs ()
virtual bool has_features ()
bool get_lhs_equals_rhs ()
virtual void remove_lhs_and_rhs ()
virtual void remove_lhs ()
virtual void remove_rhs ()
 takes all necessary steps if the rhs is removed from kernel
void set_cache_size (int32_t size)
int32_t get_cache_size ()
void list_kernel ()
bool has_property (EKernelProperty p)
virtual void clear_normal ()
virtual void add_to_normal (int32_t vector_idx, float64_t weight)
EOptimizationType get_optimization_type ()
virtual void set_optimization_type (EOptimizationType t)
bool get_is_initialized ()
virtual bool init_optimization (int32_t count, int32_t *IDX, float64_t *weights)
virtual bool delete_optimization ()
bool init_optimization_svm (CSVM *svm)
virtual float64_t compute_optimized (int32_t vector_idx)
virtual void compute_batch (int32_t num_vec, int32_t *vec_idx, float64_t *target, int32_t num_suppvec, int32_t *IDX, float64_t *alphas, float64_t factor=1.0)
float64_t get_combined_kernel_weight ()
void set_combined_kernel_weight (float64_t nw)
virtual int32_t get_num_subkernels ()
virtual void compute_by_subkernel (int32_t vector_idx, float64_t *subkernel_contrib)
virtual const float64_t * get_subkernel_weights (int32_t &num_weights)
virtual void set_subkernel_weights (SGVector< float64_t > weights)
- Public Member Functions inherited from CSGObject
 CSGObject ()
 CSGObject (const CSGObject &orig)
virtual ~CSGObject ()
virtual bool is_generic (EPrimitiveType *generic) const
template<class T >
void set_generic ()
void unset_generic ()
virtual void print_serializable (const char *prefix="")
virtual bool save_serializable (CSerializableFile *file, const char *prefix="")
virtual bool load_serializable (CSerializableFile *file, const char *prefix="")
void set_global_io (SGIO *io)
SGIO * get_global_io ()
void set_global_parallel (Parallel *parallel)
Parallel * get_global_parallel ()
void set_global_version (Version *version)
Version * get_global_version ()
SGVector< char * > get_modelsel_names ()
char * get_modsel_param_descr (const char *param_name)
index_t get_modsel_param_index (const char *param_name)

Protected Member Functions

float64_t kernelOligoFast (const std::vector< std::pair< int32_t, float64_t > > &x, const std::vector< std::pair< int32_t, float64_t > > &y, int32_t max_distance=-1)
 returns the value of the oligo kernel for sequences 'x' and 'y'

Static Protected Member Functions

static void encodeOligo (const std::string &sequence, uint32_t k_mer_length, const std::string &allowed_characters, std::vector< std::pair< int32_t, float64_t > > &values)
 encodes the signals of the sequence
static void getSequences (const std::vector< std::string > &sequences, uint32_t k_mer_length, const std::string &allowed_characters, std::vector< std::vector< std::pair< int32_t, float64_t > > > &encoded_sequences)
 encodes all sequences with the encodeOligo function and stores them in 'encoded_sequences'

Protected Attributes

int32_t k
float64_t width
float64_t * gauss_table
int32_t gauss_table_len

Additional Inherited Members

- Public Attributes inherited from CSGObject
SGIO * io
Parallel * parallel
Version * version
Parameter * m_parameters
Parameter * m_model_selection_parameters

Constructor & Destructor Documentation

COligoStringKernel ( )

default constructor

Definition at line 24 of file OligoStringKernel.cpp.

COligoStringKernel ( int32_t  cache_size,
int32_t  k,
float64_t  width 
)

Constructor

Parameters
cache_size  cache size for kernel
k  k-mer length
width  equivalent to 2*sigma^2

Definition at line 30 of file OligoStringKernel.cpp.

~COligoStringKernel ( )
virtual

Destructor

Definition at line 39 of file OligoStringKernel.cpp.

Member Function Documentation

void cleanup ( )
virtual

clean up your kernel

Reimplemented from CKernel.

Definition at line 44 of file OligoStringKernel.cpp.

float64_t compute ( int32_t  x,
int32_t  y 
)
virtual

Computes the kernel function for features a and b. idx_{a,b} denote the indices of the feature vectors in the corresponding feature object.

abstract base method

Parameters
x  index a
y  index b
Returns
computed kernel function at indices a,b

Implements CKernel.

Definition at line 233 of file OligoStringKernel.cpp.

void encodeOligo ( const std::string &  sequence,
uint32_t  k_mer_length,
const std::string &  allowed_characters,
std::vector< std::pair< int32_t, float64_t > > &  values 
)
static protected

encodes the signals of the sequence

This function stores the oligo function signals in 'values'.

The 'k_mer_length' and the 'allowed_characters' determine which signals are used. Every pair contains the position of the signal and a numerical value reflecting the signal. The numerical value is the k-mer read as a number in base n = |allowed_characters|, where each character's digit is its index in 'allowed_characters'. Example: with the allowed characters ACGT (n = 4), the value of the k-mer CG is 1 * n^1 + 2 * n^0 = 6.

Definition at line 67 of file OligoStringKernel.cpp.
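
The encoding described above can be sketched as follows. This is an illustration under stated assumptions, not the code in OligoStringKernel.cpp: the function name encode_oligo_sketch is hypothetical, it returns the pairs instead of filling an output parameter, and it assumes that a k-mer containing a character outside the allowed alphabet is simply skipped, which the documentation does not specify.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch of the k-mer encoding; not the SHOGUN implementation.
// Each pair holds (start position of the k-mer, k-mer value in base n),
// where n = allowed.size() and a character's digit is its index in `allowed`.
// Assumption: k-mers with a character outside `allowed` are skipped.
std::vector<std::pair<int, double>> encode_oligo_sketch(
    const std::string& sequence, std::size_t k, const std::string& allowed)
{
    std::vector<std::pair<int, double>> values;
    const double n = static_cast<double>(allowed.size());
    if (k == 0 || sequence.size() < k)
        return values;

    for (std::size_t pos = 0; pos + k <= sequence.size(); ++pos) {
        double value = 0.0;
        bool valid = true;
        for (std::size_t j = 0; j < k; ++j) {
            const std::size_t digit = allowed.find(sequence[pos + j]);
            if (digit == std::string::npos) { valid = false; break; }
            // Horner scheme: shift previous digits by one base-n place.
            value = value * n + static_cast<double>(digit);
        }
        if (valid)
            values.emplace_back(static_cast<int>(pos), value);
    }
    return values;
}
```

Calling encode_oligo_sketch("CG", 2, "ACGT") yields the single pair (0, 6), matching the worked example above.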

virtual EKernelType get_kernel_type ( )
virtual

return what type of kernel we are

Returns
kernel type OLIGO

Implements CStringKernel< char >.

Definition at line 69 of file OligoStringKernel.h.

virtual const char* get_name ( ) const
virtual

return the kernel's name

Returns
name Oligo

Reimplemented from CStringKernel< char >.

Definition at line 75 of file OligoStringKernel.h.

void getSequences ( const std::vector< std::string > &  sequences,
uint32_t  k_mer_length,
const std::string &  allowed_characters,
std::vector< std::vector< std::pair< int32_t, float64_t > > > &  encoded_sequences 
)
static protected

encodes all sequences with the encodeOligo function and stores them in 'encoded_sequences'

This function encodes the sequences of 'sequences' via the function encodeOligo.

Definition at line 125 of file OligoStringKernel.cpp.

bool init ( CFeatures *  l,
CFeatures *  r 
)
virtual

initialize kernel

Parameters
l  features of left-hand side
r  features of right-hand side
Returns
whether initialization was successful

Reimplemented from CStringKernel< char >.

Definition at line 53 of file OligoStringKernel.cpp.

float64_t kernelOligoFast ( const std::vector< std::pair< int32_t, float64_t > > &  x,
const std::vector< std::pair< int32_t, float64_t > > &  y,
int32_t  max_distance = -1 
)
protected

returns the value of the oligo kernel for sequences 'x' and 'y'

This function computes the kernel value of the oligo kernel, which was introduced by Meinicke et al. in 2004. 'x' and 'y' are encoded by encodeOligo and 'exp_cache' has to be constructed by getExpFunctionCache.

'max_distance' can be used to speed up the computation even further by restricting the maximum distance between a k-mer at position i in sequence 'x' and a k-mer at position j in sequence 'y'. If |i - j| > 'max_distance', the pair does not contribute to the kernel value. This approximation is switched off by default (max_distance < 0).

Definition at line 153 of file OligoStringKernel.cpp.
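
The role of 'max_distance' can be seen in a deliberately unoptimized reference sketch. It is not the fast variant documented here: the name oligo_kernel_reference is hypothetical, it loops over all pairs instead of using a precomputed cache, and it assumes each matching pair of k-mers contributes exp(-(i - j)^2 / width), with any constant prefactor omitted.

```cpp
#include <cmath>
#include <cstdlib>
#include <utility>
#include <vector>

// (position, k-mer value) pairs, as produced by the encoding step.
using Encoded = std::vector<std::pair<int, double>>;

// Hypothetical O(|x| * |y|) reference; not the SHOGUN implementation.
// Assumption: a pair of identical k-mers at positions i and j contributes
// exp(-(i - j)^2 / width). A negative max_distance disables the cutoff.
double oligo_kernel_reference(const Encoded& x, const Encoded& y,
                              double width, int max_distance = -1)
{
    double result = 0.0;
    for (const auto& a : x) {
        for (const auto& b : y) {
            if (a.second != b.second)
                continue;  // different k-mers never contribute
            const int d = std::abs(a.first - b.first);
            if (max_distance >= 0 && d > max_distance)
                continue;  // pair is too far apart under the cutoff
            result += std::exp(-static_cast<double>(d) * d / width);
        }
    }
    return result;
}
```

Because distant pairs contribute exponentially small terms, a cutoff removes work while changing the result only slightly, which is the motivation for the approximation.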

Member Data Documentation

float64_t* gauss_table
protected

cache for exp (see getExpFunctionCache above)

Definition at line 162 of file OligoStringKernel.h.

int32_t gauss_table_len
protected

length of gauss table

Definition at line 164 of file OligoStringKernel.h.
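
A table like gauss_table trades memory for repeated calls to exp. The sketch below shows the general idea only: the name build_gauss_table is hypothetical, and the stored expression exp(-d^2 / width) is an assumption, since this page does not state what the table contains.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sketch of an exp cache; not the SHOGUN implementation.
// Entry d holds exp(-d*d / width) for distances d = 0 .. len - 1
// (assumed form), so a lookup replaces a call to exp in the inner loop.
std::vector<double> build_gauss_table(std::size_t len, double width)
{
    std::vector<double> table(len);
    for (std::size_t d = 0; d < len; ++d)
        table[d] = std::exp(-static_cast<double>(d * d) / width);
    return table;
}
```

With such a table, the contribution of two k-mers at distance d is read as table[d], provided d is smaller than the table length.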

int32_t k
protected

k-mer length

Definition at line 158 of file OligoStringKernel.h.

float64_t width
protected

width of kernel

Definition at line 160 of file OligoStringKernel.h.


The documentation for this class was generated from the following files:

OligoStringKernel.h
OligoStringKernel.cpp

SHOGUN Machine Learning Toolbox - Documentation