PHYSICS 542

Applications of Numerical Methods in Physics

R. J. Wilkes

Exercises #6 (there is no #5)

(Not to be handed in; answers with next set)

1. Fit a parabola through the following data set:

x        y        sy
-0.6     5.0      2.0
-0.2     3.0      1.0
 0.2     5.0      1.0
 0.6     8.0      2.0
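
One way to set up the fit in problem 1 is a weighted least-squares solution of the normal equations. The sketch below assumes the model y = a + b*x + c*x**2 and takes the sy column as the one-sigma uncertainty on y (weights 1/sy**2). The function names fit_parabola and solve_linear are made up for this illustration and are not part of the class code.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def fit_parabola(xs, ys, sigmas):
    """Weighted least-squares fit of y = a + b*x + c*x**2.

    Returns ([a, b, c], chi2), with weights w = 1/sigma**2.
    """
    ws = [1.0 / s ** 2 for s in sigmas]
    # Normal equations: sum_k [sum_i w x^(j+k)] p_k = sum_i w x^j y
    A = [[sum(w * x ** (j + k) for w, x in zip(ws, xs)) for k in range(3)]
         for j in range(3)]
    rhs = [sum(w * x ** j * y for w, x, y in zip(ws, xs, ys)) for j in range(3)]
    a, b, c = solve_linear(A, rhs)
    chi2 = sum(w * (y - (a + b * x + c * x * x)) ** 2
               for w, x, y in zip(ws, xs, ys))
    return [a, b, c], chi2


if __name__ == "__main__":
    xs = [-0.6, -0.2, 0.2, 0.6]
    ys = [5.0, 3.0, 5.0, 8.0]
    sy = [2.0, 1.0, 1.0, 2.0]
    params, chi2 = fit_parabola(xs, ys, sy)
    print("a, b, c =", params, " chi2 =", chi2)
```

With four points and three parameters there is only one degree of freedom, so chi2 by itself says little about the quality of the fit.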

Computer Demonstration

On the class website, go to exercises/ and download fitdemo.f, intgdemo.f, medfitdemo.f, and fit.dat into your filespace.

Compile fitdemo.f and medfitdemo.f on UW's homer or dante Linux machines:
> /usr/bin/gfortran -o fitdemo fitdemo.f
> /usr/bin/gfortran -o medfitdemo medfitdemo.f
(the option is a lowercase letter "oh", not a zero)

or, on any machine that has the open-source GNU Fortran compiler g77:

> g77 -o fitdemo fitdemo.f

Fitdemo uses a data file that has the following format: line 1 = npts, followed by x,y,sigma, one data point per line. This program fits the data using linear least squares.
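
For anyone who wants to check the file format and the fit outside of Fortran, here is a minimal Python sketch. It is not a translation of fitdemo.f. It assumes the model is a straight line y = a + b*x, that the values on each line are separated by spaces or commas, and that sigma is the one-sigma error on y. The names parse_fit_data and fit_line, and the sample text, are invented for the example.

```python
import math


def parse_fit_data(text):
    """Parse npts on line 1, then one 'x y sigma' triple per line."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    npts = int(lines[0].split()[0])
    points = []
    for ln in lines[1:1 + npts]:
        # Accept either commas or whitespace between the three numbers.
        x, y, s = (float(t) for t in ln.replace(",", " ").split()[:3])
        points.append((x, y, s))
    if len(points) != npts:
        raise ValueError("file has fewer data lines than npts")
    return points


def fit_line(points):
    """Weighted linear least squares for y = a + b*x.

    Returns (a, b, sigma_a, sigma_b, chi2).
    """
    S = Sx = Sy = Sxx = Sxy = 0.0
    for x, y, s in points:
        w = 1.0 / s ** 2
        S += w
        Sx += w * x
        Sy += w * y
        Sxx += w * x * x
        Sxy += w * x * y
    delta = S * Sxx - Sx ** 2
    a = (Sxx * Sy - Sx * Sxy) / delta
    b = (S * Sxy - Sx * Sy) / delta
    chi2 = sum(((y - a - b * x) / s) ** 2 for x, y, s in points)
    return a, b, math.sqrt(Sxx / delta), math.sqrt(S / delta), chi2


if __name__ == "__main__":
    # Made-up sample in the described format (not the contents of fit.dat).
    sample = "4\n0.0 1.0 0.1\n1.0 -1.0 0.1\n2.0 -3.0 0.1\n3.0 -5.0 0.1\n"
    print(fit_line(parse_fit_data(sample)))
```

To run it on a real file, read the file into a string first, for example with open("fit.dat").read(), and pass that to parse_fit_data.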

When the program asks for the filename, enter
fit.dat
Also try medfitdemo, a "robust fitting" procedure that minimizes the sum of the absolute deviations instead of the sum of the squared deviations.

The sample data file fit.dat contains data obtained from the function f(x)=1-2x, with Gaussian errors added with sigma=0.10. However, the last data point is a deliberately introduced outlier. Notice how the LSQ fit is strongly affected, while, true to its name, the robust fitter produces a good fit to the majority of the points.
You can copy the programs into your workspace, and you can try your own data file.
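
The effect of the outlier can be reproduced with a few lines of Python. The sketch below is not the algorithm in medfitdemo.f. It finds the least-absolute-deviation line by brute force, using the fact that some optimal line passes through at least two of the data points, which is fine for small data sets. The data are made up: ten noise-free points on f(x)=1-2x with the last y shifted up by 10. The names lsq_line and lad_line are invented here.

```python
from itertools import combinations


def lsq_line(xs, ys):
    """Unweighted least-squares line y = a + b*x."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    b = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
         / sum((x - xbar) ** 2 for x in xs))
    return ybar - b * xbar, b


def lad_line(xs, ys):
    """Least-absolute-deviation line, by trying every pair of points."""
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # vertical line, cannot be written as y = a + b*x
        b = (ys[j] - ys[i]) / (xs[j] - xs[i])
        a = ys[i] - b * xs[i]
        cost = sum(abs(y - a - b * x) for x, y in zip(xs, ys))
        if best is None or cost < best[0]:
            best = (cost, a, b)
    return best[1], best[2]


if __name__ == "__main__":
    xs = [float(i) for i in range(10)]
    ys = [1.0 - 2.0 * x for x in xs]
    ys[-1] += 10.0  # deliberately introduced outlier
    print("LSQ (a, b):", lsq_line(xs, ys))
    print("LAD (a, b):", lad_line(xs, ys))
```

On this toy data the least-squares slope is pulled well away from -2 by the single bad point, while the absolute-deviation fit stays on the line through the other nine points. The brute-force search costs of order N cubed operations, so it is only meant for small N.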

Answers to last week's (set #5):

1) The first plot below shows the two data sets, and (at x=7, on the far right) the averages of the two data sets. The average lifetimes are 1.67 ± 0.244 for A and 0.748 ± 0.039 for B. Thus B differs from A by nearly 26 sigma! Not at all consistent.

2) For a chisq test: the mean masses differ by more than 6 sigma: 142.18 ± 8.76 vs 88.22 ± 8.24.

For a run test: merging the two sets, we find there are 24 runs, while for the numbers given the expected number of runs is 30.87 with variance 14.6, a 1.8 sigma deviation. Using the large-sample (Gaussian) approximation to estimate CLs, we find the probability that they are from the same population to be 3.6%, so we can reject consistency at better than the 5% CL.
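
The run-test arithmetic can be checked with the standard large-sample formulas for the mean and variance of the number of runs. In the sketch below the sample sizes 28 and 32 are an assumption: they are not stated in this answer, and were picked because they reproduce the quoted mean and variance. The function name runs_test is invented for the example, and the probability is one-sided (too few runs).

```python
import math


def runs_test(n_runs, n1, n2):
    """Large-sample run test for two merged samples of sizes n1 and n2.

    Returns (expected runs, variance, z, one-sided probability of
    seeing n_runs or fewer under the Gaussian approximation).
    """
    n = n1 + n2
    mean = 1.0 + 2.0 * n1 * n2 / n
    var = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n) / (n * n * (n - 1.0))
    z = (n_runs - mean) / math.sqrt(var)
    # Gaussian lower-tail probability P(Z <= z).
    p = 0.5 * math.erfc(-z / math.sqrt(2.0))
    return mean, var, z, p


if __name__ == "__main__":
    # 24 observed runs; sizes 28 and 32 are assumed, see the note above.
    mean, var, z, p = runs_test(24, 28, 32)
    print("expected runs = %.2f, variance = %.2f" % (mean, var))
    print("z = %.2f, one-sided probability = %.3f" % (z, p))
```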

For a K-S test, see the 2nd plot below: this is my lazy plot with data points joined by straight lines, NOT the horizontal and vertical steps they should be, but you can see a Dmax of about 0.46, while Dmax for 28 data points at the 5% CL is 0.34. Thus the K-S test also rejects consistency at the 5% CL.
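
For reference, Dmax can also be computed directly from the two samples instead of being read off a plot. The sketch below builds the two cumulative distributions as proper step functions and takes the largest vertical gap between them. The function name ks_two_sample and the sample numbers are made up for illustration; they are not the data from the problem.

```python
from bisect import bisect_right


def ks_two_sample(a, b):
    """Two-sample K-S statistic: max |F_a(x) - F_b(x)| over all x.

    F_a and F_b are the empirical cumulative distributions (step
    functions), so it is enough to check them at every data value.
    """
    a = sorted(a)
    b = sorted(b)
    dmax = 0.0
    for v in a + b:
        fa = bisect_right(a, v) / len(a)  # fraction of a that is <= v
        fb = bisect_right(b, v) / len(b)  # fraction of b that is <= v
        dmax = max(dmax, abs(fa - fb))
    return dmax


if __name__ == "__main__":
    # Hypothetical samples, only to show the call.
    sample_a = [1.0, 2.0, 3.0, 4.0]
    sample_b = [3.0, 4.0, 5.0, 6.0]
    print("Dmax =", ks_two_sample(sample_a, sample_b))
```

The value returned is then compared with the critical Dmax for the chosen confidence level and sample sizes.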

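[Plot for prob. 1: the two data sets and their averages]

[Plot for prob. 2, K-S test: cumulative distributions of the two data sets]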