6.5 Foreach Datum Loops

Foreach datum loops are similar to foreach loops in that they run a script block once for each item in a list. In this case, however, the list in question is the list of datapoints in a file. The syntax of the foreach datum command is similar to that of the commands met in the previous chapter for acting upon datafiles: the standard modifiers every, index, select and using can be used to select which columns of the datafile, and which subset of the datapoints, should be used:

foreach datum i,j,name in "data.dat" using 1:2:"%s"%($3)
 {
  ...
 }

The foreach datum command is followed by a comma-separated list of the variables that are to be read from the datafile on each iteration of the loop. The using modifier specifies the columns or rows of data used to set the value of each variable, with one colon-separated expression per variable. In this example, i and j take the values in the first and second columns, while the third variable, name, is set by the string expression "%s"%($3), so that it equals whatever string of text is found in the third column of the datafile.
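As a minimal sketch of what the script block might contain, the loop below prints one line for each datapoint. It assumes a hypothetical file data.dat whose first two columns are numeric and whose third column holds a text label; the file name and its layout are illustrative only. The print command and the % string substitution operator are used in the same way as in the worked example later in this section.

```pyxplot
foreach datum i,j,name in "data.dat" using 1:2:"%s"%($3)
 {
  print "%s: i = %s, j = %s"%(name,i,j)
 }
```

Each pass through the loop sees a fresh set of values for i, j and name, taken from the next datapoint in the file.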

Calculating the Mean and Standard Deviation of Data.

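The following PyXPlot script calculates the mean and standard deviation of a set of datapoints using the foreach datum command. The data are supplied inline: the special filename '--' causes the datapoints to be read from the lines that follow the loop's script block, up to the line END.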

N_data = 0
sum_x = 0
sum_x2 = 0

foreach datum x in '--'
 {
  N_data = N_data + 1
  sum_x = sum_x + x
  sum_x2 = sum_x2 + x**2
 }
1.3
1.2
1.5
1.1
1.3
END

mean = sum_x / N_data
SD = sqrt(sum_x2 / N_data - mean**2)

print "Mean = %s"%(mean)
print "SD = %s"%(SD)

For the data supplied, a mean of $1.28$ and a standard deviation of $0.133$ are returned.
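The arithmetic behind these figures can be checked by hand. The working below uses the same running sums as the script, with $N$, $\bar{x}$ and $\sigma$ introduced here as shorthand for N_data, mean and SD:

```latex
\begin{align*}
N &= 5, \qquad \sum_i x_i = 6.4, \qquad \sum_i x_i^2 = 8.28 \\
\bar{x} &= \frac{1}{N}\sum_i x_i = \frac{6.4}{5} = 1.28 \\
\sigma &= \sqrt{\frac{1}{N}\sum_i x_i^2 - \bar{x}^2}
        = \sqrt{1.656 - 1.6384}
        = \sqrt{0.0176} \approx 0.133
\end{align*}
```

Because the sum of squares is divided by $N$ rather than $N-1$, this is the population standard deviation; the sample standard deviation of the same five points would be $\sqrt{0.088/4} \approx 0.148$.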