User:Richard T. Meyers/Notebook/Phys307l/Poisson Statistics Lab
Revision as of 18:31, 21 December 2010
Procedure
The online lab procedure is out of date but still has useful information, linked here. We asked Professor Koch about the lab and found that much of it is simply having the computer count the events over multiple time intervals, which is easily done through the software. Once the data is gathered we need only analyze it with histograms. This will be done in the lab.
Data
Note: blue is the data overlay, red is the Poisson overlay, and orange is the Gaussian overlay.
{{#widget:Google Spreadsheet |key=0ArI06ZBK1lTAdF9KclhmZzI2bGhEbjhSUk5vdUlHQUE |width=830 |height=700 }}
{{#widget:Google Spreadsheet |key=0ArI06ZBK1lTAdDFDQ0VsTEdOeGlhUGtXX1h0cUREa2c |width=830 |height=700 }}
{{#widget:Google Spreadsheet |key=0ArI06ZBK1lTAdERoNmxRSE05dHgtWDRJcmZxbzZQVXc |width=830 |height=700 }}
{{#widget:Google Spreadsheet |key=0ArI06ZBK1lTAdEFLQjVZVFNLMVZqeHhRLUNlcmNpSGc |width=830 |height=700 }}
{{#widget:Google Spreadsheet |key=0ArI06ZBK1lTAdElPZTFWLV9NOV9ZRmp5d1QtWElTUHc |width=830 |height=700 }}
{{#widget:Google Spreadsheet |key=0ArI06ZBK1lTAdG45dVU2YWUzY3BLbmlOaVhDZFNlNlE |width=830 |height=700 }}
{{#widget:Google Spreadsheet |key=0ArI06ZBK1lTAdFZIV09vSXpRdUVIU3EteC1MZWV4MFE |width=830 |height=700 }}
{{#widget:Google Spreadsheet |key=0ArI06ZBK1lTAdHZRbFFkN0FaNVJLSHRwUHNmYS1RRlE |width=830 |height=700 }}
{{#widget:Google Spreadsheet |key=0ArI06ZBK1lTAdFRlSWFBb2w0WTNXWXZEVXV3eGJleWc |width=830 |height=700 }}
{{#widget:Google Spreadsheet |key=0ArI06ZBK1lTAdEJxNkxLVlpwUXdSN1ZUVEJQNk9nWnc |width=830 |height=700 }}
Calculations
I looked up what the Poisson Function is:
- [math]\displaystyle{ f(k; \lambda)=\frac{\lambda^k e^{-\lambda}}{k!},\,\! }[/math]
Where [math]\displaystyle{ k }[/math] is the number of events per time interval and where [math]\displaystyle{ \lambda }[/math] is the expected value of the number of events per interval.
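As a check on the spreadsheet overlays, the Poisson function above can be evaluated directly. A minimal sketch in Python (not part of the original analysis, which was done in Google Docs):

```python
import math

def poisson_pmf(k, lam):
    """Poisson probability of observing k events in an interval
    whose expected number of events is lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

# Sanity check: the probabilities over all k must sum to 1.
total = sum(poisson_pmf(k, 3.0) for k in range(50))
```

A column of these values against the measured counts is all the "Poisson overlay" in the plots above requires.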
I looked up what the Gaussian Function is:
- [math]\displaystyle{ f(x)=\frac{1}{\sqrt{2 \pi \sigma^2}}e^{-\frac{(x-\mu)^2}{2\sigma^2}},\,\! }[/math]
Where [math]\displaystyle{ \mu }[/math] is the mean, [math]\displaystyle{ \sigma }[/math] is the standard deviation and [math]\displaystyle{ x }[/math] is the number of occurrences at a point.
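The Gaussian overlay can be generated the same way; a minimal sketch:

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Gaussian (normal) density with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

# The density peaks at x = mu with value 1/sqrt(2*pi*sigma^2).
peak = gaussian_pdf(0.0, 0.0, 1.0)
```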
Also, for the preamp plots, 40pre and 100pre, I used a Q-Q plot to determine if they were normally distributed.
The quantiles are defined with n as the sample size, N(k) as the number of intervals at count k, and k as the point in the sample.
The Quantile column is: [math]\displaystyle{ \sum_{k=0}^{n}\frac{k}{n} }[/math]
The Data-Quantile column is: [math]\displaystyle{ \sum_{k=0}^{n}\frac{N(k)}{n} }[/math]
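A sketch of how these two cumulative columns could be built from a histogram of counts. The `counts` list here is made-up example data, not the actual 40pre/100pre counts (those live in the spreadsheets above):

```python
def qq_columns(counts, n):
    """Build the Quantile and Data-Quantile columns as running sums.

    counts[k] is N(k), the number of intervals in which k events were seen;
    n is the sample size (total number of intervals).
    """
    quantile, data_quantile = [], []
    q = dq = 0.0
    for k, n_k in enumerate(counts):
        q += k / n          # running sum of k/n  (the Quantile column)
        dq += n_k / n       # running sum of N(k)/n  (the Data-Quantile column)
        quantile.append(q)
        data_quantile.append(dq)
    return quantile, data_quantile

# Hypothetical example: 10 intervals with counts of 0, 1, 2, or 3 events.
q_col, dq_col = qq_columns([3, 4, 2, 1], n=10)
```

If the two columns track each other linearly when plotted against one another, the data are consistent with the assumed distribution; curvature signals a departure from it.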
I then plotted the Data-Quantile versus the Quantile to see if they were linear. By definition, taken from the Wikipedia article cited below, for the data to be Normal, and thus effectively Poisson, the Q-Q plot should be linear. By observation, neither the 100pre nor the 40pre plot is linear, so neither is Normal or Poisson. Furthermore, using the standard deviation and mean calculated in the Google Docs spreadsheet, we notice that the 40pre data has 6 points more than three standard deviations from the mean, and the 100pre data has 8 such points. For the 40pre this falls inside the expected [math]\displaystyle{ 0.3\%,\,\! }[/math], but for the 100pre the 8 data points represent a probability outside of [math]\displaystyle{ 0.3\%,\,\! }[/math]. This is unusual.
Either way, I have decided to discard the preamp data for this experiment, mostly because it is not normally distributed and is thus not a Poisson distribution. (Steve Koch 20:31, 21 December 2010 (EST): This is not necessarily true, but for large expected counts, it is.)
The next step is to go through the non-preamp data and see if anything else should be discarded. I see no data that should be discarded; however, it should be noted that there is a large variation in the 800ms data that will introduce error. Because that data still follows the Poisson distribution, albeit roughly, it can be used.
From the lab manual we find that the standard deviation should equal the square root of the mean:
[math]\displaystyle{ \text{standard deviation}=\sqrt{\text{mean}},\,\! }[/math]
10ms: stdev=0.54767796013852
20ms: stdev=0.78642612359645
40ms: stdev=1.11239409182849
80ms: stdev=1.58183385316939
100ms: stdev=1.7488917560004
200ms: stdev=2.45148330520545
400ms: stdev=3.46438365068596
800ms: stdev=4.90067440998601
Standard Deviation calculated in Google Docs
10ms: stdev=0.54594837315979
20ms: stdev=0.7775635504387
40ms: stdev=1.14761292926255
80ms: stdev=1.62063612220736
100ms: stdev=1.76461434607371
200ms: stdev=2.41024659670614
400ms: stdev=3.41192722689169
800ms: stdev=4.93394466988019
The difference between the square root of the mean and the standard deviation:
10ms: Difference=0.001729586978738
20ms: Difference=0.008862573157752
40ms: Difference=0.03521883743406
80ms: Difference=0.038802269037964
100ms: Difference=0.01572259007331
200ms: Difference=0.04123670849931
400ms: Difference=0.052456423794275
800ms: Difference=0.033270259894178
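The difference column can be reproduced directly from the two lists above. A quick Python check, with the numbers copied from this page (tiny trailing-digit discrepancies against the listed differences come from the spreadsheet's higher internal precision):

```python
# sqrt(mean) and sample standard deviation for each time interval,
# copied from the tables above (10ms through 800ms, in order).
sqrt_mean = [0.54767796013852, 0.78642612359645, 1.11239409182849,
             1.58183385316939, 1.7488917560004, 2.45148330520545,
             3.46438365068596, 4.90067440998601]
stdev = [0.54594837315979, 0.7775635504387, 1.14761292926255,
         1.62063612220736, 1.76461434607371, 2.41024659670614,
         3.41192722689169, 4.93394466988019]

differences = [abs(s - m) for s, m in zip(stdev, sqrt_mean)]
```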
Conclusions
I can see from the graphs with the Poisson overlay that each of the data sets (excluding the preamp sets) follows a Poisson distribution. The preamp sets, from the Q-Q graphs, are shown not to be Poisson. I also notice that as the time interval increases from 10ms to 800ms, the graphs tend to move closer to a Gaussian than to a Poisson; that is, as the time interval increases, the distributions approach a normal distribution.
I overlaid the Gaussian distribution onto both the 400ms and 800ms plots and noticed that both are close to the data and to the Poisson distributions.
Lastly, the differences between the square root of the mean and the standard deviation increase as the time intervals increase. So we can say the data is closer to a Poisson for short time intervals and closer to a Gaussian for longer time intervals.
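The convergence seen in the plots can be illustrated numerically: for a large expected count λ, the Poisson distribution approaches a Gaussian with μ = λ and σ = √λ. A sketch (the pmf is evaluated in log space so large λ and k do not overflow):

```python
import math

def poisson_pmf(k, lam):
    # log-space evaluation: exp(k*ln(lam) - lam - ln(k!))
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def gaussian_pdf(x, mu, sigma):
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def max_gap(lam):
    """Largest pointwise gap between the Poisson pmf and its matching Gaussian."""
    mu, sigma = lam, math.sqrt(lam)
    return max(abs(poisson_pmf(k, lam) - gaussian_pdf(k, mu, sigma))
               for k in range(int(4 * lam) + 1))

# The gap shrinks as the expected count grows: Poisson -> Gaussian,
# mirroring what the 400ms and 800ms overlays show.
```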
Citation
1) I got the information for the Q-Q plot here
2) I got the information for the Poisson Distribution here
3) I got the information for the Gaussian Distribution here
Thanks
1)To Steve Koch for assistance in the lab specifically telling me about the COUNTIF function
2)To Katie Richardson for assistance with google docs and the amusing Chicago Piano Tuners Problem.
3)To Nathan for assistance in the lab and being a good lab partner.