Data Management

Importing and exporting data

mlpy.data_fromfile(file, ytype=int)

Read data file in the form:

x11 [TAB] x12 [TAB] ... x1n [TAB] y1
x21 [TAB] x22 [TAB] ... x2n [TAB] y2
 .         .        .    .        .
 .         .         .   .        .
 .         .          .  .        .
xm1 [TAB] xm2 [TAB] ... xmn [TAB] ym

where xij are float and yi are of type ‘ytype’ (numpy.int or numpy.float).

Input

  • file - data file name
  • ytype - numpy datatype for labels (numpy.int or numpy.float)

Output

  • x - data [2D numpy array float]
  • y - classes [1D numpy array int or float]

Example:

>>> from numpy import *
>>> from mlpy import *
>>> x, y = data_fromfile('data_example.dat')
>>> x
array([[ 1.1,  2. ,  5.3,  3.1],
       [ 3.7,  1.4,  2.3,  4.5],
       [ 1.4,  5.4,  3.1,  1.4]])
>>> y
array([ 1, -1,  1])
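The file layout described above can also be parsed with plain NumPy, which may help when checking a data file by hand. The following is a minimal sketch and not mlpy's implementation: `read_labeled` and the in-memory sample text are hypothetical, and the sample values simply mirror the example.

```python
import io
import numpy as np

# Hypothetical file contents in the layout described above:
# tab-separated features, with the label in the last column.
text = ("1.1\t2.0\t5.3\t3.1\t1\n"
        "3.7\t1.4\t2.3\t4.5\t-1\n"
        "1.4\t5.4\t3.1\t1.4\t1\n")

def read_labeled(fileobj, ytype=int):
    """NumPy-only stand-in (not mlpy's code): split features and labels."""
    table = np.loadtxt(fileobj, delimiter="\t", ndmin=2)
    x = table[:, :-1]               # every column but the last, as float
    y = table[:, -1].astype(ytype)  # last column, cast to the label type
    return x, y

x, y = read_labeled(io.StringIO(text))
print(x.shape)     # (3, 4)
print(y.tolist())  # [1, -1, 1]
```

Passing `ytype=float` to the sketch keeps the labels as floats, mirroring the two label types the function accepts.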

mlpy.data_fromfile_wl(file)

Read data file in the form:

x11 [TAB] x12 [TAB] ... x1n [TAB]
x21 [TAB] x22 [TAB] ... x2n [TAB]
 .         .        .    .
 .         .         .   .       
 .         .          .  .       
xm1 [TAB] xm2 [TAB] ... xmn [TAB]

where xij are float.

Input

  • file - data file name

Output

  • x - data [2D numpy array float]

Example:

>>> from numpy import *
>>> from mlpy import *
>>> x = data_fromfile_wl('data_example.dat')
>>> x
array([[ 1.1,  2. ,  5.3,  3.1],
       [ 3.7,  1.4,  2.3,  4.5],
       [ 1.4,  5.4,  3.1,  1.4]])

mlpy.data_tofile(file, x, y, sep='\t')

Write data file in the form:

x11 [sep] x12 [sep] ... x1n [sep] y1
x21 [sep] x22 [sep] ... x2n [sep] y2
 .         .        .    .        .
 .         .         .   .        .
 .         .          .  .        .
xm1 [sep] xm2 [sep] ... xmn [sep] ym

where xij are float and yi are integer.

Input

  • file - data file name
  • x - data [2D numpy array float]
  • y - classes [1D numpy array integer]
  • sep - separator
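To illustrate the output layout, here is a minimal sketch that writes the same form with the standard library and NumPy. It is not mlpy's implementation: `write_labeled` is a hypothetical helper, and the way floats are formatted is an assumption, since the source does not specify it.

```python
import io
import numpy as np

x = np.array([[1.1, 2.0, 5.3, 3.1],
              [3.7, 1.4, 2.3, 4.5]])
y = np.array([1, -1])

def write_labeled(fileobj, x, y, sep="\t"):
    """NumPy-only stand-in (not mlpy's code): one sample per line, label last."""
    for row, label in zip(x, y):
        # Float formatting via repr() is an assumption made for this sketch.
        fields = [repr(float(v)) for v in row] + [str(int(label))]
        fileobj.write(sep.join(fields) + "\n")

buf = io.StringIO()
write_labeled(buf, x, y, sep=",")
written = buf.getvalue()
print(written)
```

The sketch writes to an in-memory buffer so it can be run without touching the file system; an open file object works the same way.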

mlpy.data_tofile_wl(file, x, sep='\t')

Write data file in the form:

x11 [sep] x12 [sep] ... x1n [sep]
x21 [sep] x22 [sep] ... x2n [sep]
 .         .        .    .       
 .         .         .   .       
 .         .          .  .       
xm1 [sep] xm2 [sep] ... xmn [sep]

where xij are float.

Input

  • file - data file name
  • x - data [2D numpy array float]
  • sep - separator

Normalization

mlpy.data_normalize(x)

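Normalize numpy array (2D) x. Judging from the example below, each row is shifted to mean 0 and scaled to unit sample standard deviation.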

Input

  • x - data [2D numpy array float]

Output

  • normalized data

Example:

>>> from numpy import *
>>> from mlpy import *
>>> x = array([[ 1.1,  2. ,  5.3,  3.1],
...            [ 3.7,  1.4,  2.3,  4.5],
...            [ 1.4,  5.4,  3.1,  1.4]])
>>> data_normalize(x)
array([[-0.9797065 , -0.48295391,  1.33847226,  0.12418815],
       [ 0.52197912, -1.13395464, -0.48598056,  1.09795608],
       [-0.75217354,  1.35919078,  0.1451563 , -0.75217354]])

Warning

Deprecated in version 2.3
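The numbers in the data_normalize example are consistent with a row-wise transform that subtracts each row's mean and divides by its sample standard deviation (ddof=1). The sketch below reproduces them with plain NumPy. `normalize_rows` is a hypothetical name, and the formula is inferred from the example output rather than taken from mlpy's source.

```python
import numpy as np

x = np.array([[1.1, 2.0, 5.3, 3.1],
              [3.7, 1.4, 2.3, 4.5],
              [1.4, 5.4, 3.1, 1.4]])

def normalize_rows(x):
    """Inferred from the example output, not from mlpy's source."""
    mean = x.mean(axis=1, keepdims=True)        # one mean per row
    std = x.std(axis=1, ddof=1, keepdims=True)  # sample std per row
    return (x - mean) / std

z = normalize_rows(x)
print(np.round(z, 4))
```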

mlpy.data_standardize(x, p=None)

Standardize the 2D numpy array x by columns and, optionally, standardize p using the column means and standard deviations of x.

Input

  • x - data [2D numpy array float]
  • p - optional data [2D numpy array float]

Output

  • standardized data

Example:

>>> from numpy import *
>>> from mlpy import *
>>> x = array([[ 1.1,  2. ,  5.3,  3.1],
...            [ 3.7,  1.4,  2.3,  4.5],
...            [ 1.4,  5.4,  3.1,  1.4]])
>>> data_standardize(x)
array([[-0.67958381, -0.43266792,  1.1157668 ,  0.06441566],
       [ 1.1482623 , -0.71081158, -0.81536804,  0.96623494],
       [-0.46867849,  1.1434795 , -0.30039875, -1.0306506 ]])

Warning

Deprecated in version 2.3. Use mlpy.standardize and mlpy.standardize_from instead.
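The data_standardize example output matches a column-wise z-score with the sample standard deviation (ddof=1). The following NumPy sketch reproduces it. `standardize_columns` is a hypothetical name, the formula is inferred from the example, and returning a pair when p is given is an assumption, since the source does not describe that return value.

```python
import numpy as np

x = np.array([[1.1, 2.0, 5.3, 3.1],
              [3.7, 1.4, 2.3, 4.5],
              [1.4, 5.4, 3.1, 1.4]])

def standardize_columns(x, p=None):
    """Inferred from the example output, not from mlpy's source."""
    mean = x.mean(axis=0)        # one mean per column
    std = x.std(axis=0, ddof=1)  # sample std per column
    xs = (x - mean) / std
    if p is None:
        return xs
    # Assumption: p is transformed with the statistics of x, not its own.
    return xs, (p - mean) / std

xs = standardize_columns(x)
p = np.array([[2.0, 3.0, 4.0, 3.0]])
_, ps = standardize_columns(x, p)
print(np.round(xs, 4))
```

Reusing the statistics of x for p keeps both arrays on the same scale, which matters when p holds samples that were not available when the statistics were computed.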

mlpy.standardize(x)

Standardize x.

x is standardized by columns to have mean 0 and unit length. Returns the standardized x, the mean, and the standard deviation.

mlpy.center(y)

Center y to have mean 0.

Return centered y.

mlpy.standardize_from(x, mean, std)

Standardize x using external mean and standard deviation.

Return standardized x.

mlpy.center_from(y, mean)

Center y using external mean.

Return centered y.
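The `_from` variants exist so that statistics computed on one data set can be applied to another, for example from training data to test data. The sketch below shows that pattern with plain NumPy stand-ins. The helper names are hypothetical, and using the sample standard deviation (ddof=1) as the column scale is an assumption, because the source only says "unit length".

```python
import numpy as np

def apply_standardization(x, mean, std):
    """Stand-in for the standardize_from idea: reuse external statistics."""
    return (x - mean) / std

def apply_centering(y, mean):
    """Stand-in for the center_from idea: subtract an external mean."""
    return y - mean

x_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 60.0]])
y_train = np.array([1.0, 2.0, 6.0])
x_test = np.array([[2.0, 30.0]])
y_test = np.array([3.0])

# Statistics come from the training data only.
x_mean = x_train.mean(axis=0)
x_std = x_train.std(axis=0, ddof=1)  # assumption: sample standard deviation
y_mean = y_train.mean()

x_train_s = apply_standardization(x_train, x_mean, x_std)
x_test_s = apply_standardization(x_test, x_mean, x_std)
y_train_c = apply_centering(y_train, y_mean)
y_test_c = apply_centering(y_test, y_mean)

print(x_test_s, y_test_c)
```

Only the training arrays end up with exactly zero mean; the test arrays are shifted and scaled by the same amounts, so they stay comparable to the training data.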
