SHOGUN v0.9.0
CAlphabet Class Reference

Detailed Description

The class Alphabet implements an alphabet and alphabet utility functions.

These utility functions can be used to remap characters to more (bit-)efficient representations, check whether a string is valid, compute histograms, etc.

Currently supported alphabets are DNA, RAWDNA, RNA, PROTEIN, BINARY, ALPHANUM, CUBE, RAW, IUPAC_NUCLEIC_ACID and IUPAC_AMINO_ACID.

Definition at line 89 of file Alphabet.h.

Inheritance diagram for CAlphabet:
[inheritance graph not shown]

List of all members.

Public Member Functions

 CAlphabet ()
 CAlphabet (char *alpha, int32_t len)
 CAlphabet (EAlphabet alpha)
 CAlphabet (CAlphabet *alpha)
virtual ~CAlphabet ()
bool set_alphabet (EAlphabet alpha)
EAlphabet get_alphabet () const
int32_t get_num_symbols () const
int32_t get_num_bits () const
uint8_t remap_to_bin (uint8_t c)
uint8_t remap_to_char (uint8_t c)
void clear_histogram ()
 clear histogram
template<class T >
void add_string_to_histogram (T *p, int64_t len)
void add_byte_to_histogram (uint8_t p)
void print_histogram ()
 print histogram
void get_hist (int64_t **h, int32_t *len)
const int64_t * get_histogram ()
 get pointer to histogram
bool check_alphabet (bool print_error=true)
bool is_valid (uint8_t c)
bool check_alphabet_size (bool print_error=true)
int32_t get_num_symbols_in_histogram ()
int32_t get_max_value_in_histogram ()
int32_t get_num_bits_in_histogram ()
virtual const char * get_name () const

Static Public Member Functions

static const char * get_alphabet_name (EAlphabet alphabet)
template<class ST >
static void translate_from_single_order (ST *obs, int32_t sequence_length, int32_t start, int32_t p_order, int32_t max_val)
template<class ST >
static void translate_from_single_order_reversed (ST *obs, int32_t sequence_length, int32_t start, int32_t p_order, int32_t max_val)
template<class ST >
static void translate_from_single_order (ST *obs, int32_t sequence_length, int32_t start, int32_t p_order, int32_t max_val, int32_t gap)
template<class ST >
static void translate_from_single_order_reversed (ST *obs, int32_t sequence_length, int32_t start, int32_t p_order, int32_t max_val, int32_t gap)

Static Public Attributes

static const uint8_t B_A = 0
static const uint8_t B_C = 1
static const uint8_t B_G = 2
static const uint8_t B_T = 3
static const uint8_t B_0 = 4
static const uint8_t MAPTABLE_UNDEF = 0xff
static const char * alphabet_names [18]

Protected Member Functions

void init_map_table ()
void copy_histogram (CAlphabet *src)
virtual void load_serializable_post (void) throw (ShogunException)

Protected Attributes

EAlphabet alphabet
int32_t num_symbols
int32_t num_bits
bool valid_chars [1<< (sizeof(uint8_t)*8)]
uint8_t maptable_to_bin [1<< (sizeof(uint8_t)*8)]
uint8_t maptable_to_char [1<< (sizeof(uint8_t)*8)]
int64_t histogram [1<< (sizeof(uint8_t)*8)]

Constructor & Destructor Documentation

CAlphabet ( )

default constructor

Definition at line 34 of file Alphabet.cpp.

CAlphabet ( char *  alpha,
int32_t  len 
)

constructor

Parameters:
alpha  alphabet to use
len  length of alpha

Definition at line 40 of file Alphabet.cpp.

CAlphabet ( EAlphabet  alpha)

constructor

Parameters:
alpha  alphabet (type) to use

Definition at line 87 of file Alphabet.cpp.

CAlphabet ( CAlphabet *  alpha)

constructor

Parameters:
alpha  alphabet to use

Definition at line 94 of file Alphabet.cpp.

~CAlphabet ( ) [virtual]

Definition at line 103 of file Alphabet.cpp.


Member Function Documentation

void add_byte_to_histogram ( uint8_t  p)

add element to histogram

Parameters:
p  element

Definition at line 191 of file Alphabet.h.

void add_string_to_histogram ( T *  p,
int64_t  len 
)

make histogram for whole string

Parameters:
p  string
len  length of string

Definition at line 181 of file Alphabet.h.
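
As a rough illustration of what add_string_to_histogram and add_byte_to_histogram do, the sketch below counts byte values in a 256-entry table. It is not SHOGUN code: the type ByteHistogramSketch and its members are hypothetical, and casting each element to uint8_t is an assumption about how the templated version treats its input.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// The array extent used on this page, 1 << (sizeof(uint8_t)*8), is 256:
// one slot for every possible byte value.
constexpr std::size_t kTableSize = 1 << (sizeof(std::uint8_t) * 8);
static_assert(kTableSize == 256, "one slot per possible uint8_t value");

// Hypothetical stand-in for the class's histogram; not SHOGUN code.
struct ByteHistogramSketch
{
    std::array<std::int64_t, kTableSize> counts{};  // zero-initialized

    // Reset every bin to zero.
    void clear() { counts.fill(0); }

    // Count a single byte value.
    void add_byte(std::uint8_t b) { ++counts[b]; }

    // Count every element of a string of length len.
    // Assumption: each element is reduced to a byte before counting.
    template <class T>
    void add_string(const T* p, std::int64_t len)
    {
        assert(p != nullptr || len == 0);
        for (std::int64_t i = 0; i < len; ++i)
            add_byte(static_cast<std::uint8_t>(p[i]));
    }
};
```

For the string "ACGTAA" this sketch ends up with a count of 3 in the bin for 'A' and 1 in each of the bins for 'C', 'G' and 'T'.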

bool check_alphabet ( bool  print_error = true)

Check whether the symbols in the histogram are valid in the alphabet, e.g. for DNA, whether only the letters A, C, G and T appear.

Parameters:
print_error  if errors shall be printed
Returns:
true if the symbols in the histogram are valid in the alphabet

Definition at line 594 of file Alphabet.cpp.
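
The check described above can be sketched as a comparison of two 256-entry tables: the histogram and a table of valid characters. The function histogram_fits_alphabet and its parameters are hypothetical names chosen for this illustration, and the error message format is invented; only the idea (every symbol that occurs must be valid) comes from the description.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstdio>

// Hypothetical helper, not SHOGUN code: returns true when every symbol
// with a nonzero count in the histogram is marked valid.
bool histogram_fits_alphabet(const std::array<std::int64_t, 256>& histogram,
                             const std::array<bool, 256>& valid,
                             bool print_error = true)
{
    bool ok = true;
    for (unsigned c = 0; c < 256; ++c)
    {
        assert(histogram[c] >= 0);  // counts are never negative
        if (histogram[c] > 0 && !valid[c])
        {
            ok = false;
            if (print_error)
                std::fprintf(stderr,
                             "symbol %u occurs %lld time(s) but is not in the alphabet\n",
                             c, static_cast<long long>(histogram[c]));
        }
    }
    return ok;
}
```

With a valid table that marks only 'A', 'C', 'G' and 'T', a histogram containing an 'N' makes this sketch return false.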

bool check_alphabet_size ( bool  print_error = true)

Check whether ALL symbols in the histogram fit in the alphabet.

Parameters:
print_error  if errors shall be printed
Returns:
true if ALL symbols in the histogram fit in the alphabet

Definition at line 616 of file Alphabet.cpp.

void clear_histogram ( )

clear histogram

Definition at line 543 of file Alphabet.cpp.

void copy_histogram ( CAlphabet *  src) [protected]

copy histogram

Parameters:
src  alphabet to copy histogram from

Definition at line 633 of file Alphabet.cpp.

EAlphabet get_alphabet ( ) const

get alphabet

Returns:
alphabet

Definition at line 128 of file Alphabet.h.

const char * get_alphabet_name ( EAlphabet  alphabet) [static]

return alphabet name

Parameters:
alphabet  alphabet type to get name from

Definition at line 638 of file Alphabet.cpp.

void get_hist ( int64_t **  h,
int32_t *  len 
)

get histogram

Parameters:
h  where the histogram will be stored
len  length of histogram

Definition at line 204 of file Alphabet.h.

const int64_t* get_histogram ( )

get pointer to histogram

Definition at line 216 of file Alphabet.h.

int32_t get_max_value_in_histogram ( )

return maximum value in histogram

Returns:
maximum value in histogram

Definition at line 549 of file Alphabet.cpp.

virtual const char* get_name ( void  ) const [virtual]
Returns:
object name

Implements CSGObject.

Definition at line 275 of file Alphabet.h.

int32_t get_num_bits ( ) const

get number of bits necessary to store all symbols in alphabet

Returns:
number of necessary storage bits

Definition at line 147 of file Alphabet.h.
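
The relation between the number of symbols and the number of storage bits can be illustrated as follows, under the assumption that the bit count is the smallest b with 2^b >= num_symbols. The helper bits_for_symbols is a hypothetical name; how the class itself computes num_bits is not stated on this page.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not SHOGUN code: smallest number of bits b such
// that 2^b >= num_symbols.
std::int32_t bits_for_symbols(std::int32_t num_symbols)
{
    assert(num_symbols >= 1);
    std::int32_t bits = 0;
    // Widen to 64 bits so the shift cannot overflow for large inputs.
    while ((std::int64_t{1} << bits) < num_symbols)
        ++bits;
    return bits;
}
```

Under this assumption a 4-symbol alphabet such as DNA needs 2 bits per symbol, and a 5-symbol alphabet needs 3.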

int32_t get_num_bits_in_histogram ( )

return number of bits required to store all symbols in histogram

Returns:
number of bits required to store all symbols in histogram

Definition at line 576 of file Alphabet.cpp.

int32_t get_num_symbols ( ) const

get number of symbols in alphabet

Returns:
number of symbols

Definition at line 137 of file Alphabet.h.

int32_t get_num_symbols_in_histogram ( )

return number of symbols in histogram

Returns:
number of symbols in histogram

Definition at line 564 of file Alphabet.cpp.
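
A minimal sketch of this count, assuming that "symbols in histogram" means bins with a nonzero count. The function count_nonzero_bins is a hypothetical name and takes a raw pointer and length only to keep the example self-contained.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not SHOGUN code: number of histogram bins whose
// count is greater than zero.
std::int32_t count_nonzero_bins(const std::int64_t* histogram, std::int32_t len)
{
    assert(histogram != nullptr && len >= 0);
    std::int32_t distinct = 0;
    for (std::int32_t i = 0; i < len; ++i)
    {
        if (histogram[i] > 0)
            ++distinct;
    }
    return distinct;
}
```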

void init_map_table ( ) [protected]

init map table

Definition at line 178 of file Alphabet.cpp.

bool is_valid ( uint8_t  c)

Check whether a symbol is valid in the alphabet, e.g. for DNA, whether the symbol is one of A, C, G or T.

Parameters:
c  symbol
Returns:
true if the symbol is a valid character in the alphabet

Definition at line 235 of file Alphabet.h.

void load_serializable_post ( void  ) throw (ShogunException) [protected, virtual]

Can (optionally) be overridden to post-initialize some member variables which are not PARAMETER::ADD'ed. Make sure that the overridden method BASE_CLASS::LOAD_SERIALIZABLE_POST is called first.

Exceptions:
ShogunException  Will be thrown if an error occurs.

Reimplemented from CSGObject.

Definition at line 718 of file Alphabet.cpp.

void print_histogram ( )

print histogram

Definition at line 585 of file Alphabet.cpp.

uint8_t remap_to_bin ( uint8_t  c)

remap element, e.g. translate ACGT to 0123

Parameters:
c  element to remap
Returns:
remapped element

Definition at line 157 of file Alphabet.h.

uint8_t remap_to_char ( uint8_t  c)

remap element, e.g. translate 0123 to ACGT

Parameters:
c  element to remap
Returns:
remapped element

Definition at line 167 of file Alphabet.h.
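
The two remap functions can be pictured as lookups in a pair of 256-entry tables. The sketch below builds such tables for the DNA case only, using the codes B_A = 0 to B_T = 3 and the marker value 0xff (MAPTABLE_UNDEF) listed on this page. The type DnaRemapSketch is hypothetical, and filling unmapped entries of both tables with 0xff is an assumption, not something this page states.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Mirrors MAPTABLE_UNDEF from this page.
constexpr std::uint8_t kUndef = 0xff;

// Hypothetical stand-in for the class's mapping tables; not SHOGUN code.
struct DnaRemapSketch
{
    std::array<std::uint8_t, 256> to_bin;
    std::array<std::uint8_t, 256> to_char;

    DnaRemapSketch()
    {
        // Assumption: every entry without a mapping holds the undefined marker.
        to_bin.fill(kUndef);
        to_char.fill(kUndef);
        const char symbols[4] = {'A', 'C', 'G', 'T'};  // codes 0, 1, 2, 3
        for (std::uint8_t code = 0; code < 4; ++code)
        {
            const std::uint8_t ch = static_cast<std::uint8_t>(symbols[code]);
            to_bin[ch] = code;
            to_char[code] = ch;
        }
    }

    // Character to compact code, e.g. 'G' to 2.
    std::uint8_t remap_to_bin(std::uint8_t c) const { return to_bin[c]; }

    // Compact code back to character, e.g. 2 to 'G'.
    std::uint8_t remap_to_char(std::uint8_t c) const { return to_char[c]; }
};
```

In this sketch a character outside the alphabet, such as 'N', maps to 0xff, which a caller can compare against the undefined marker before using the result.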

bool set_alphabet ( EAlphabet  alpha)

set alphabet and initialize mapping table (for remap)

Parameters:
alpha  new alphabet

Definition at line 107 of file Alphabet.cpp.

static void translate_from_single_order ( ST *  obs,
int32_t  sequence_length,
int32_t  start,
int32_t  p_order,
int32_t  max_val 
) [static]

translate from single order

Parameters:
obs  observation
sequence_length  length of sequence
start  start
p_order  order
max_val  maximum value

Definition at line 286 of file Alphabet.h.
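
This page does not spell out what "translate from single order" means. One plausible reading, assumed here and not confirmed by the text, is that each position is replaced by a single integer that packs the last p_order symbols, with max_val bits per symbol. The sketch below illustrates only that packing idea: pack_windows and its parameters are hypothetical, it returns a new vector instead of working in place, it ignores start and gap, and treating positions before the sequence start as 0 is an additional assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not SHOGUN code: for every position i, pack the
// `order` symbols ending at i into one integer, `bits_per_symbol` bits each.
// The oldest symbol of the window lands in the highest bits.
std::vector<std::uint32_t> pack_windows(const std::vector<std::uint8_t>& symbols,
                                        int order, int bits_per_symbol)
{
    assert(order >= 1);
    assert(bits_per_symbol >= 1 && bits_per_symbol < 32);
    assert(order * bits_per_symbol <= 32);

    std::vector<std::uint32_t> packed(symbols.size(), 0);
    for (std::size_t i = 0; i < symbols.size(); ++i)
    {
        std::uint32_t value = 0;
        for (int k = order - 1; k >= 0; --k)
        {
            // Symbol k positions before i; positions before the start count as 0.
            const std::uint32_t s =
                (static_cast<std::size_t>(k) <= i) ? symbols[i - k] : 0u;
            value = (value << bits_per_symbol) | s;
        }
        packed[i] = value;
    }
    return packed;
}
```

For the codes 0, 1, 2, 3 (ACGT after remapping) with order 2 and 2 bits per symbol, this sketch yields 0, 1, 6, 11: for example, the window (1, 2) becomes 1*4 + 2 = 6.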

static void translate_from_single_order ( ST *  obs,
int32_t  sequence_length,
int32_t  start,
int32_t  p_order,
int32_t  max_val,
int32_t  gap 
) [static]

translate from single order

Parameters:
obs  observation
sequence_length  length of sequence
start  start
p_order  order
max_val  maximum value
gap  gap

Definition at line 379 of file Alphabet.h.

static void translate_from_single_order_reversed ( ST *  obs,
int32_t  sequence_length,
int32_t  start,
int32_t  p_order,
int32_t  max_val 
) [static]

translate from single order reversed

Parameters:
obs  observation
sequence_length  length of sequence
start  start
p_order  order
max_val  maximum value

Definition at line 332 of file Alphabet.h.

static void translate_from_single_order_reversed ( ST *  obs,
int32_t  sequence_length,
int32_t  start,
int32_t  p_order,
int32_t  max_val,
int32_t  gap 
) [static]

translate from single order reversed

Parameters:
obs  observation
sequence_length  length of sequence
start  start
p_order  order
max_val  maximum value
gap  gap

Definition at line 450 of file Alphabet.h.


Member Data Documentation

EAlphabet alphabet [protected]

alphabet

Definition at line 551 of file Alphabet.h.

const char * alphabet_names [static]
Initial value:
{
    "DNA","RAWDNA", "RNA", "PROTEIN", "BINARY", "ALPHANUM",
    "CUBE", "RAW", "IUPAC_NUCLEIC_ACID", "IUPAC_AMINO_ACID",
    "NONE", "DIGIT", "DIGIT2", "RAWDIGIT", "RAWDIGIT2", "UNKNOWN",
    "SNP", "RAWSNP"}

alphabet names

Definition at line 536 of file Alphabet.h.
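
A name table like the one above is typically indexed by the numeric value of an enumeration. The sketch below copies the 18 names from the initializer and looks one up by index. That the EAlphabet values follow the same order as the table is an assumption, and alphabet_name_sketch is a hypothetical function, not the library's get_alphabet_name.

```cpp
#include <cassert>

// The 18 names from the initializer above, in the same order.
static const int kNumAlphabetNames = 18;
static const char* const kAlphabetNames[kNumAlphabetNames] = {
    "DNA", "RAWDNA", "RNA", "PROTEIN", "BINARY", "ALPHANUM",
    "CUBE", "RAW", "IUPAC_NUCLEIC_ACID", "IUPAC_AMINO_ACID",
    "NONE", "DIGIT", "DIGIT2", "RAWDIGIT", "RAWDIGIT2", "UNKNOWN",
    "SNP", "RAWSNP"};

// Hypothetical lookup, not SHOGUN code. Assumption: the index is the
// numeric value of the alphabet type and matches the table order.
const char* alphabet_name_sketch(int index)
{
    assert(index >= 0 && index < kNumAlphabetNames);
    return kAlphabetNames[index];
}
```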

const uint8_t B_0 = 4 [static]

B_0

Definition at line 532 of file Alphabet.h.

const uint8_t B_A = 0 [static]

B_A

Definition at line 524 of file Alphabet.h.

const uint8_t B_C = 1 [static]

B_C

Definition at line 526 of file Alphabet.h.

const uint8_t B_G = 2 [static]

B_G

Definition at line 528 of file Alphabet.h.

const uint8_t B_T = 3 [static]

B_T

Definition at line 530 of file Alphabet.h.

int64_t histogram[1<< (sizeof(uint8_t)*8)] [protected]

histogram

Definition at line 563 of file Alphabet.h.

uint8_t maptable_to_bin[1<< (sizeof(uint8_t)*8)] [protected]

maptable to bin

Definition at line 559 of file Alphabet.h.

uint8_t maptable_to_char[1<< (sizeof(uint8_t)*8)] [protected]

maptable to char

Definition at line 561 of file Alphabet.h.

const uint8_t MAPTABLE_UNDEF = 0xff [static]

MAPTABLE UNDEF

Definition at line 534 of file Alphabet.h.

int32_t num_bits [protected]

number of bits

Definition at line 555 of file Alphabet.h.

int32_t num_symbols [protected]

number of symbols

Definition at line 553 of file Alphabet.h.

bool valid_chars[1<< (sizeof(uint8_t)*8)] [protected]

valid chars

Definition at line 557 of file Alphabet.h.


The documentation for this class was generated from the following files:

SHOGUN Machine Learning Toolbox - Documentation