User:Richard T. Meyers/Notebook/Phys307l/Poisson Statistics Lab

Revision as of 18:31, 21 December 2010

Procedure

The online lab procedure is out of date but still has useful information, linked here. We asked Professor Koch about the lab and found that much of it consists of having the computer count the events over multiple time intervals, which is easily done through the software. Once the data is gathered, we need only analyze it with histograms. This will be done in the lab.

High Voltage Spectrometer
The Photo Multiplier Tube

Data

Note: Blue is the Data Overlay, Red is the Poisson Overlay, and Orange is the Gaussian Overlay.

{{#widget:Google Spreadsheet |key=0ArI06ZBK1lTAdF9KclhmZzI2bGhEbjhSUk5vdUlHQUE |width=830 |height=700 }}

{{#widget:Google Spreadsheet |key=0ArI06ZBK1lTAdDFDQ0VsTEdOeGlhUGtXX1h0cUREa2c |width=830 |height=700 }}

{{#widget:Google Spreadsheet |key=0ArI06ZBK1lTAdERoNmxRSE05dHgtWDRJcmZxbzZQVXc |width=830 |height=700 }}

{{#widget:Google Spreadsheet |key=0ArI06ZBK1lTAdEFLQjVZVFNLMVZqeHhRLUNlcmNpSGc |width=830 |height=700 }}

{{#widget:Google Spreadsheet |key=0ArI06ZBK1lTAdElPZTFWLV9NOV9ZRmp5d1QtWElTUHc |width=830 |height=700 }}

{{#widget:Google Spreadsheet |key=0ArI06ZBK1lTAdG45dVU2YWUzY3BLbmlOaVhDZFNlNlE |width=830 |height=700 }}

{{#widget:Google Spreadsheet |key=0ArI06ZBK1lTAdFZIV09vSXpRdUVIU3EteC1MZWV4MFE |width=830 |height=700 }}

{{#widget:Google Spreadsheet |key=0ArI06ZBK1lTAdHZRbFFkN0FaNVJLSHRwUHNmYS1RRlE |width=830 |height=700 }}

{{#widget:Google Spreadsheet |key=0ArI06ZBK1lTAdFRlSWFBb2w0WTNXWXZEVXV3eGJleWc |width=830 |height=700 }}

{{#widget:Google Spreadsheet |key=0ArI06ZBK1lTAdEJxNkxLVlpwUXdSN1ZUVEJQNk9nWnc |width=830 |height=700 }}

Calculations

I looked up what the Poisson Function is:

[math]\displaystyle{ f(k; \lambda)=\frac{\lambda^k e^{-\lambda}}{k!},\,\! }[/math]

Where [math]\displaystyle{ k }[/math] is the number of events per time interval and where [math]\displaystyle{ \lambda }[/math] is the expected value of the number of events per interval.
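The formula above is easy to evaluate directly. A minimal Python sketch (the function name is mine, not part of the original notebook) of the Poisson probability for a given count:

```python
import math

def poisson_pmf(k, lam):
    """Probability of observing k events in an interval when lam are expected."""
    return (lam ** k) * math.exp(-lam) / math.factorial(k)

# As a sanity check, the probabilities over all k should sum to (nearly) 1.
total = sum(poisson_pmf(k, 3.0) for k in range(60))
```

For k = 0 this reduces to e^(-lambda), which is a quick way to spot-check an overlay column in the spreadsheet.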

I looked up what the Gaussian Function is:

[math]\displaystyle{ f(x)=\frac{1}{\sqrt{2 \pi \sigma^2}}e^{-\frac{(x-\mu)^2}{2\sigma^2}},\,\! }[/math]

Where [math]\displaystyle{ \mu }[/math] is the mean, [math]\displaystyle{ \sigma }[/math] is the standard deviation and [math]\displaystyle{ x }[/math] is the number of occurrences at a point.
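The Gaussian overlay values can be computed the same way. A short sketch (again, the function name is my own) of the density defined above:

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) \
        / math.sqrt(2 * math.pi * sigma ** 2)

# The peak sits at x = mu with height 1/sqrt(2*pi*sigma^2).
peak = gaussian_pdf(5.0, 5.0, 2.0)
```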

Also, for the preamp plots, 40pre and 100pre, I used a Q-Q plot to determine if they were normally distributed.

The quantiles are defined with n the sample size, N(k) the number of intervals with count k, and k the point in the sample.

The Quantile column is: [math]\displaystyle{ \sum_{k=0}^{n}\frac{k}{n} }[/math]

The Data-Quantile column is: [math]\displaystyle{ \sum_{k=0}^{n}\frac{N(k)}{n} }[/math]
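The two running-sum columns above can be built in a few lines. This is a hypothetical sketch of the spreadsheet columns as defined here (assuming `counts[k]` holds N(k), the number of intervals with count k), not a general-purpose Q-Q routine:

```python
def qq_columns(counts):
    """Build the Quantile and Data-Quantile columns as running sums,
    following the definitions in this notebook."""
    n = sum(counts)                # total sample size
    quantile, data_quantile = [], []
    q = dq = 0.0
    for k, n_k in enumerate(counts):
        q += k / n                 # running sum of k/n      (Quantile column)
        dq += n_k / n              # running sum of N(k)/n   (Data-Quantile column)
        quantile.append(q)
        data_quantile.append(dq)
    return quantile, data_quantile
```

Plotting `data_quantile` against `quantile` then gives the line whose straightness is being judged; the Data-Quantile column is just the empirical cumulative fraction, so it always ends at 1.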

I then plotted the Data-Quantile versus the Quantile to see if they were linear. By definition, taken from the Wikipedia article cited below, for the data to be normal (or effectively Poisson) the Q-Q plot should be linear. By observation, neither the 100pre nor the 40pre plot is linear, so neither is Normal or Poisson. Furthermore, using the standard deviation and mean calculated in the Google Docs spreadsheet, we notice that the 40pre has 6 points outside the third deviation from the mean, and the 100pre has 8 points outside three deviations from the mean. For the 40pre this falls inside of [math]\displaystyle{ 0.3\%\,\! }[/math], but for the 100pre the 8 data points represent a probability outside of [math]\displaystyle{ 0.3\%\,\! }[/math]. This is unusual.

Either way, I have decided to discard the preamp data for this experiment, mostly because it is not normally distributed and is thus not a Poisson distribution. (Steve Koch 20:31, 21 December 2010 (EST): This is not necessarily true, but for large expected counts, it is.)

The next step is to go through the non-preamp data and see if anything else should be discarded. I see no data that should be discarded; however, it should be noted that there is a large variation in the 800ms data that will introduce error, but because it follows the Poisson distribution, albeit roughly, it can still be used.

From the lab manual we find that the standard deviation should equal the square root of the mean:

[math]\displaystyle{ \sigma=\sqrt{\mu}\,\! }[/math]
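This check is easy to automate. A small sketch (function name mine; the sample variance uses the n-1 divisor, matching what Google Docs' STDEV computes) comparing the sample standard deviation with the square root of the mean:

```python
import math

def stdev_vs_sqrt_mean(counts):
    """Return |sample stdev - sqrt(mean)|; this should be small
    if the counts are Poisson-distributed."""
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)  # sample variance
    return abs(math.sqrt(var) - math.sqrt(mean))
```

Applied to each dwell-time data set, this yields the difference column tabulated below.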

Square root of the mean (predicted standard deviation):

10ms: sqrt(mean)=0.54767796013852

20ms: sqrt(mean)=0.78642612359645

40ms: sqrt(mean)=1.11239409182849

80ms: sqrt(mean)=1.58183385316939

100ms: sqrt(mean)=1.7488917560004

200ms: sqrt(mean)=2.45148330520545

400ms: sqrt(mean)=3.46438365068596

800ms: sqrt(mean)=4.90067440998601

Standard Deviation calculated in Google Docs

10ms: stdev=0.54594837315979

20ms: stdev=0.7775635504387

40ms: stdev=1.14761292926255

80ms: stdev=1.62063612220736

100ms: stdev=1.76461434607371

200ms: stdev=2.41024659670614

400ms: stdev=3.41192722689169

800ms: stdev=4.93394466988019

The difference between the square root of the mean and the standard deviation:

10ms: Difference=0.001729586978738

20ms: Difference=0.008862573157752

40ms: Difference=0.03521883743406

80ms: Difference=0.038802269037964

100ms: Difference=0.01572259007331

200ms: Difference=0.04123670849931

400ms: Difference=0.052456423794275

800ms: Difference=0.033270259894178

Conclusions

I can see from the graphs of the Poisson overlay that each of the data sets (excluding the preamp sets) follows a Poisson distribution. The preamp sets, from the Q-Q graphs, are shown not to be Poisson. I also notice that as the time interval increases from 10ms to 800ms, the graph tends closer to a Gaussian than a Poisson. This means that as the time interval increases, the distributions approach a normal distribution.

I overlaid the Gaussian distribution onto both the 400ms and 800ms plots and noticed that they are both close to the data and to the Poisson distributions.

Lastly, the differences between the square root of the mean and the standard deviation increase as the time intervals increase. So we can say the data is closer to a Poisson for short time intervals and closer to a Gaussian for longer time intervals.

Citations

1) I got the information for the Q-Q plot here

2) I got the information for the Poisson Distribution here

3) I got the information for the Gaussian Distribution here

Thanks

1) To Steve Koch for assistance in the lab, specifically telling me about the COUNTIF function.

2) To Katie Richardson for assistance with Google Docs and the amusing Chicago Piano Tuners Problem.

3) To Nathan for assistance in the lab and being a good lab partner.