2.11 Function Fitting

The fit command fits functional forms to data points read from files. A simple example might be:[1]

f(x) = a*x+b
fit f() 'data.dat' index 1 using 2:3 via a,b

The first line specifies the functional form to be used. The coefficients of this function that are to be varied during the fitting process are listed after the keyword via in the fit command. The modifiers index, every and using have the same meanings here as in the plot command.[2] For example, given the following data file, entitled “square.dat”, which contains a sampled square wave:

    0.314159          1
    0.942478          1
    1.570796          1
    2.199115          1
    2.827433          1
    3.455752         -1
    4.084070         -1
    4.712389         -1
    5.340708         -1
    5.969026         -1

the following script fits a truncated Fourier series to it. The output can be found in Figure 2.3.

f(x) = a1*sin(x) + a3*sin(3*x) + a5*sin(5*x)
fit f() 'square.dat' via a1, a3, a5
set xlabel '$x$' ; set ylabel '$y$'
plot 'square.dat' title 'data' with points pointsize 2, \
     f(x) title 'Fitted function' with lines
\includegraphics{examples/eps/ex_fitting.eps}
Figure 2.3: The output from a script that fits a truncated Fourier series to a sampled square wave. Even with only three terms, the Gibbs phenomenon is becoming apparent (see http://en.wikipedia.org/wiki/Gibbs_phenomenon for an explanation).
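
As an independent cross-check of the fit above, the same coefficients can be estimated outside PyXPlot. The following Python sketch is not part of PyXPlot; it assumes that, when no error bars are supplied, the fit amounts to an unweighted least-squares fit. Because the model is linear in a1, a3 and a5, the best-fitting values follow directly from the normal equations. The helper name solve is introduced here purely for illustration.

```python
import math

# Sample points and values copied from square.dat above.
xs = [0.314159, 0.942478, 1.570796, 2.199115, 2.827433,
      3.455752, 4.084070, 4.712389, 5.340708, 5.969026]
ys = [1, 1, 1, 1, 1, -1, -1, -1, -1, -1]

harmonics = [1, 3, 5]  # f(x) = a1*sin(x) + a3*sin(3*x) + a5*sin(5*x)


def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Move the row with the largest entry in this column to the top.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back-substitution on the upper-triangular system.
    sol = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (aug[r][n] - tail) / aug[r][r]
    return sol


# Normal equations (B^T B) a = B^T y, where B[k][j] = sin(n_j * x_k).
basis = [[math.sin(n * x) for n in harmonics] for x in xs]
normal = [[sum(row[i] * row[j] for row in basis) for j in range(3)]
          for i in range(3)]
moment = [sum(row[i] * y for row, y in zip(basis, ys)) for i in range(3)]

a1, a3, a5 = solve(normal, moment)
print("least squares     :", round(a1, 4), round(a3, 4), round(a5, 4))
print("continuous series :", [round(4 / (n * math.pi), 4) for n in harmonics])
```

The second line printed gives the coefficients 4/(n*pi) of the Fourier series of a continuous square wave. The least-squares values for the ten sampled points need not coincide with them, since only the samples are being fitted.
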

The fit command is useful for producing best-fit lines[3] and for estimating the gradients of datasets. The syntax is essentially identical to that used by Gnuplot, though a few points are worth noting:

At the end of the fitting process, the best-fitting values of each parameter are output to the terminal, along with an estimate of the uncertainty in each. Additionally, the Hessian, covariance and correlation matrices are output in both human-readable and machine-readable formats, allowing a more complete assessment of the probability distribution of the parameters.
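
To illustrate how a covariance matrix relates to the quoted uncertainties and to the correlation matrix, the following Python sketch works through the textbook case of a straight-line fit f(x) = a*x+b. It is not PyXPlot's implementation, and PyXPlot's exact definitions (for example, of the Hessian) may differ. The data values are invented for illustration, and estimating the noise level from the residuals is an assumption of the sketch.

```python
import math

# Hypothetical data, invented for illustration: roughly y = 2x + 1 with scatter.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]
n = len(xs)

# Sums entering the normal equations for f(x) = a*x + b.
sx = sum(xs)
sy = sum(ys)
sxx = sum(x * x for x in xs)
sxy = sum(x * y for x, y in zip(xs, ys))
det = n * sxx - sx * sx

# Best-fit gradient a and intercept b.
a = (n * sxy - sx * sy) / det
b = (sxx * sy - sx * sxy) / det

# Noise variance estimated from the residuals (n - 2 degrees of freedom).
resid = [y - (a * x + b) for x, y in zip(xs, ys)]
sigma2 = sum(r * r for r in resid) / (n - 2)

# Covariance matrix = sigma^2 * inverse of [[sxx, sx], [sx, n]].
# (The Hessian of the sum of squared residuals is twice that matrix.)
cov_aa = sigma2 * n / det
cov_bb = sigma2 * sxx / det
cov_ab = -sigma2 * sx / det

# Uncertainties are the square roots of the diagonal elements; the
# correlation coefficient rescales the off-diagonal element to [-1, 1].
err_a = math.sqrt(cov_aa)
err_b = math.sqrt(cov_bb)
corr_ab = cov_ab / (err_a * err_b)

print(f"a = {a:.3f} +/- {err_a:.3f}")
print(f"b = {b:.3f} +/- {err_b:.3f}")
print(f"correlation(a, b) = {corr_ab:.3f}")
```

In this sketch the gradient and intercept are strongly anti-correlated, which is why the off-diagonal elements carry information that the individual uncertainties alone do not.
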

Footnotes

  1. In Gnuplot, this example would have been written fit f(x) ..., rather than fit f() .... This syntax is supported in PyXPlot, but is deprecated.
  2. The select modifier, to be introduced in Section 4.3, can also be used.
  3. Another way of producing best-fit lines is to use a cubic spline; more details are given in Section 6.2.