# Procedure for Error Propagation

- In the following procedure, the standard deviation of the mean (SDOM) is calculated according to the formula

- [math]\sigma_m = \sqrt{\frac{1}{N(N-1)} \sum_{i=1}^N (x_i - \overline{x})^2}\,[/math]

- I only recently received a copy of Taylor's informative text on error analysis. I have chosen to apply the methods of Chapter 8 (Least-Squares Fitting) to my data, and so I have redone this section of the error analysis completely. Explaining the steps that will be taken will help me as well as the reader.
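The SDOM formula quoted at the top of this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `sdom` and the sample readings are hypothetical and are not taken from the experiment.

```python
import math


def sdom(values):
    """Standard deviation of the mean: sqrt( sum((x_i - mean)^2) / (N (N - 1)) )."""
    n = len(values)
    if n < 2:
        # With a single reading the N - 1 factor is zero and the SDOM is undefined.
        raise ValueError("need at least two measurements")
    mean = sum(values) / n
    sum_sq_dev = sum((x - mean) ** 2 for x in values)
    return math.sqrt(sum_sq_dev / (n * (n - 1)))


# Made-up readings for illustration only (not measured data).
readings = [1.0, 2.0, 3.0, 4.0, 5.0]
print(sdom(readings))
```

The same number results from dividing the sample standard deviation (the one with N-1 in the denominator) by the square root of N.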

**1.)** I begin by assuming that the stopping potential data are normally distributed about the line y = A + Bx. Judging by the low deviation of the voltage data at each frequency, this distribution is very narrow. If we assume no substantial error in the x variable, the probability of obtaining the observed values of y (in my case eV) is proportional to the Gaussian:

- [math] Prob_{A,B}(y_1,...,y_N) \propto \frac{1}{\sigma_y ^N} e^{-\chi^{2}/2} [/math]

where [math] \chi^2 = \frac{\left(\sum_{i=1}^N (y_i - A - Bx_i)^2\right)}{\sigma_y ^2} [/math]

and [math] y = eV [/math], [math] A = -\omega_0 [/math], [math] B = h [/math] for this experiment.
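To make the role of [math]\chi^2[/math] concrete, here is a minimal Python sketch that evaluates it for two trial lines. The function name `chi_squared`, the data points, and the value of `sigma_y` are all made up for illustration; they are not the measured stopping potentials.

```python
import math


def chi_squared(xs, ys, a, b, sigma_y):
    """chi^2 = sum_i (y_i - A - B x_i)^2 / sigma_y^2 for a trial line y = A + B x."""
    return sum((y - a - b * x) ** 2 for x, y in zip(xs, ys)) / sigma_y ** 2


# Hypothetical data scattered about y = 1 + 2x (illustration only).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 4.9, 7.1]
sigma_y = 0.1  # assumed common uncertainty in each y

chi2_close = chi_squared(xs, ys, a=1.0, b=2.0, sigma_y=sigma_y)
chi2_far = chi_squared(xs, ys, a=1.0, b=2.5, sigma_y=sigma_y)

# The probability is proportional to exp(-chi^2 / 2), so the line with the
# smaller chi^2 is the more probable one.
print(chi2_close, math.exp(-chi2_close / 2))
print(chi2_far, math.exp(-chi2_far / 2))
```

The line that hugs the points gives a far smaller [math]\chi^2[/math], and therefore a far larger relative probability, than the line with the wrong slope.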

**2.)** Since the best estimates for A and B are those values for which [math]Prob_{A,B}(y_1,...,y_N)[/math] is a maximum, or equivalently for which [math]\chi^2[/math] is a minimum, the least-squares estimates of A and B (-ω_0 and h) are found by setting the derivatives of [math]\chi^2[/math] with respect to A and B separately to zero and then solving the resulting two equations for A and B. The result is

- [math] A = \frac{\sum_{i=1}^N x_i^2 \sum_{i=1}^N y_i - \sum_{i=1}^N x_i \sum_{i=1}^N x_iy_i}{\Delta}[/math]
- [math] B = \frac{N\sum_{i=1}^N x_iy_i - \sum_{i=1}^N x_i \sum_{i=1}^N y_i}{\Delta}[/math]

where [math] \Delta = N\sum_{i=1}^N x_i^2 - \left(\sum_{i=1}^N x_i\right)^2 [/math]

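I choose to perform this step using the MATLAB functions *polyfit* to find the coefficients (A & B) and *polyval* to evaluate the fitted line at the data points for plotting. Note that *polyfit* returns the coefficients in descending powers, so for a first-degree fit the slope B comes first and the intercept A second.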
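For readers who want to see the closed-form expressions for A and B evaluated directly, here is a minimal Python sketch. It is not the MATLAB code used for the analysis; the function name `linear_fit` and the data are hypothetical.

```python
def linear_fit(xs, ys):
    """Least-squares intercept A and slope B for y = A + B x (closed-form sums)."""
    n = len(xs)
    if n != len(ys):
        raise ValueError("xs and ys must have the same length")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    delta = n * sum_xx - sum_x ** 2
    if delta == 0:
        # Happens when all x values coincide (or there is only one point).
        raise ValueError("need at least two distinct x values")
    a = (sum_xx * sum_y - sum_x * sum_xy) / delta
    b = (n * sum_xy - sum_x * sum_y) / delta
    return a, b


# Hypothetical data scattered about y = 1 + 2x (illustration only).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 4.9, 7.1]
A, B = linear_fit(xs, ys)
print(A, B)
```

Writing the sums out once and reusing them keeps the code in one-to-one correspondence with the formulas, which makes it easy to check against the output of a library fit.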

**3.)** With the least squares coefficients determined I can proceed to calculating the errors of y (eV), A (-ω_0), and B (h). Taylor shows that the uncertainty for y (with x assumed to be without error) is

- [math]\sigma_y= \sqrt{\frac{1}{N-2} \sum_{i=1}^N (y_i - A-Bx_i)^2}\,[/math]

I use this definition but divide by an extra N to make it the SDOM. This formula is similar to the familiar SDOM formula quoted above, except that it has a factor of N-2 in place of N-1 in the denominator. The difference arises because, in calculating an uncertainty, we divide by the number of independent measured values. When we have to calculate one quantity (the average) from the data before computing the uncertainty, we are left with N-1 independent values, or *degrees of freedom*; when we have to calculate two quantities (A & B) first, we are left with N-2 *degrees of freedom*. Taylor says that there are good statistical reasons to divide by the number of degrees of freedom instead of the number of measurements, but does not justify it mathematically. Qualitatively, though, this makes sense: were we to measure just two pairs of data (x1,y1) and (x2,y2), so that N=2, we could always draw a straight line exactly through the two points. The residuals would all be zero and the fit would tell us nothing about the scatter in y, so the uncertainty should be undefined, which it is: the formula gives 0/0.
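The formula for σ_y as Taylor states it (without the extra factor of N) can be sketched in Python as follows. The function name `sigma_y_from_fit` and the data are hypothetical, and the values A = 1 and B = 2 are assumed to be the least-squares coefficients for those made-up points.

```python
import math


def sigma_y_from_fit(xs, ys, a, b):
    """sigma_y = sqrt( sum_i (y_i - A - B x_i)^2 / (N - 2) ) about the line y = A + B x."""
    n = len(xs)
    if n <= 2:
        # Two points always lie exactly on a line: zero residuals over zero
        # degrees of freedom, so the uncertainty is undefined.
        raise ValueError("need more than two points to estimate sigma_y")
    sum_sq_resid = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sum_sq_resid / (n - 2))


# Hypothetical data whose least-squares line is y = 1 + 2x (illustration only).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 4.9, 7.1]
print(sigma_y_from_fit(xs, ys, a=1.0, b=2.0))
```

Raising an error for N ≤ 2 makes the degrees-of-freedom argument explicit instead of letting a 0/0 slip through silently.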

**4.)** With the uncertainty in y determined I finally calculate the uncertainties in the quantities that I am really interested in, A and B, or rather ω_0 and h. Taylor explains that from the two expressions for A and B above we can apply the general error propagation formula (summing partial derivatives times their respective uncertainties in quadrature) to arrive at the uncertainties for A and B (I repeat that I use the extra 1/N factor in my σ_y):

- [math] \sigma_A = \sigma_y\sqrt{\frac{\sum_{i=1}^N x_i^2}{\Delta}}[/math]
- [math] \sigma_B = \sigma_y\sqrt{\frac{N}{\Delta}}[/math]

where once more [math]\Delta = N\sum_{i=1}^N x_i^2 - \left(\sum_{i=1}^N x_i\right)^2 [/math]
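The two uncertainty formulas can likewise be sketched in Python. The function name `fit_uncertainties` and the numbers are hypothetical. The sketch takes σ_y as an input, so whichever convention is used for σ_y (Taylor's, or one with an extra factor of N) is applied before the call.

```python
import math


def fit_uncertainties(xs, sigma_y):
    """Return (sigma_A, sigma_B) for the fit y = A + B x, given sigma_y.

    sigma_A = sigma_y * sqrt(sum(x^2) / Delta)
    sigma_B = sigma_y * sqrt(N / Delta)
    Delta   = N * sum(x^2) - (sum(x))^2
    """
    n = len(xs)
    sum_x = sum(xs)
    sum_xx = sum(x * x for x in xs)
    delta = n * sum_xx - sum_x ** 2
    if delta <= 0:
        raise ValueError("need at least two distinct x values")
    sigma_a = sigma_y * math.sqrt(sum_xx / delta)
    sigma_b = sigma_y * math.sqrt(n / delta)
    return sigma_a, sigma_b


# Hypothetical inputs (illustration only): four x values and an assumed sigma_y.
xs = [0.0, 1.0, 2.0, 3.0]
sigma_A, sigma_B = fit_uncertainties(xs, sigma_y=math.sqrt(0.02))
print(sigma_A, sigma_B)
```

Both uncertainties scale linearly with σ_y and depend otherwise only on the x values, so spreading the x values over a wider range (a larger Δ) tightens the uncertainty in the slope.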

**5.)** In obtaining my final results I calculate the standard error of the mean (stdm) for each set of data, then average these errors for the total stdm of ω_0 and h.