Physics307L F08:People/Mondragon/Formal Lab Report
Poisson Statistics of Gamma Ray Detection Events from the Gamma Ray Background
Tomas Mondragon, University of New Mexico, Albuquerque NM, tmondrag@unm.edu
Abstract
The Poisson distribution is a probability function that often appears in counting experiments where the events being counted happen randomly but at a definite average rate. In this experiment, we count gamma ray detection events. The diffuse gamma ray background ensures that the Earth is bathed in a steady stream of gamma rays from every direction, although the galactic plane, active galactic nuclei, and pulsars can be brighter than the background. Over the course of a couple of hours the intensity should not vary by much unless a pulsar jet sweeps past the Earth or a solar flare erupts in the direction of Earth. Thus, as long as our detection instruments maintain a constant sensitivity, we should detect gamma rays at random times but at a steady average rate, and the data we collect should conform to the Poisson probability mass function. After several counting experiments at a given dwell time, the Poisson distribution can be used to estimate the expected number of events detected in one dwell time. By repeating this with different dwell times, we calculate the average gamma ray detection rate.
Introduction
The Poisson distribution is a probability function that often shows up in counting experiments where either events are being counted that happen randomly but at a definite average rate or objects are being counted that are distributed randomly throughout a space but with a definite average number density. It describes the probability of counting [math]\displaystyle{ k }[/math] events or objects in a discrete chunk of time or space when [math]\displaystyle{ \lambda }[/math] is the expected number of events or objects in an equivalently sized discrete chunk of time or space with the same average event rate or object number density.
If Ω is the average event rate or object number density, and η is the size of the discrete chunk of time or space, [math]\displaystyle{ \lambda=\Omega*\eta }[/math]
The probability that [math]\displaystyle{ k }[/math] events or objects will be counted in a discrete chunk of time or space of size η is [math]\displaystyle{ P(k;\lambda)=\frac{\lambda^k e^{-\lambda}}{k!} }[/math]
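As a quick numerical illustration (not part of the original analysis), the formula above can be evaluated directly. The rate and dwell values here are hypothetical stand-ins for Ω and η:

```python
import math

def poisson_pmf(k, lam):
    # P(k; lambda) = lambda^k * e^(-lambda) / k!
    return lam ** k * math.exp(-lam) / math.factorial(k)

# Hypothetical numbers: Omega = 7.4 events/s, eta = 0.1 s, so lambda = 0.74
lam = 7.4 * 0.1
for k in range(4):
    print(k, poisson_pmf(k, lam))
```

The probabilities over all k sum to 1, as a probability mass function must.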
In this experiment, we count gamma ray detection events. The diffuse gamma ray background ensures that the Earth is bathed in a steady stream of gamma rays from every direction, although the galactic plane, active galactic nuclei, and pulsars can be brighter than the background. Over the course of a couple of hours the intensity should not vary by much unless a pulsar jet sweeps past the Earth or a solar flare erupts in the direction of Earth. Thus, as long as our detection instruments maintain a constant sensitivity, we should detect gamma rays at random times but at a steady average rate, and the data we collect should conform to the Poisson probability mass function.
Methods
To conduct a series of counting experiments in which a very low to moderately high count number was expected, we used a gamma ray detector consisting of a NaI(Tl) scintillator attached to a PMT with adjustable sensitivity, encased in an aluminum housing. The detector unit was a Harshaw NaI(Tl) Integral Line scintillation detector, type 12S12 (Serial # EG 320, UNM Property # 149864). The PMT base is an Ortec Model 266 PM base (Serial # 1354, Rev # 10, UNM Property # 149895). The aluminum housing blocks α and β radiation. We placed the detector unit in a lead cave containing no obvious gamma sources to reduce the expected count rate.
The sensitivity of the PMT in the detector unit could be adjusted by changing the output voltage of the high voltage power supply feeding the PMT. Our power supply was a Bertan Associates, Inc. Model 315 DC Power Supply. To further reduce the count rate, we kept the supply voltage at about 1000 V throughout the experiment.
The event counter we used was an MCA card inside one of the lab computers: an Oxford Instruments Inc. Third Generation Personal Computer Analyzer, Version 2.42. The MCA required voltage pulses larger than the PMT's output, so we sent the output through an amplifier to shape the pulses appropriately; we used a Canberra Model 814 Preamp/Amp/Disc. To make the MCA bin events by detection time rather than by the pulse height output by the PMT and amplifier, we shorted the MCA/REJ pin to the SCA pin on the exterior part of the card with a breakout/hydra cable.
We set up the computer storing our count data to put the data in 256 bins, so that the whole detecting and counting apparatus ran 256 counting experiments on each data collection run. In our first data collection run, we set the dwell time to 10 ms, so the apparatus counted the gamma ray detection events that happened in a 10 ms window until it had performed this experiment 256 times. We repeated this for dwell times of 20 ms, 40 ms, 80 ms, 100 ms, 200 ms, 400 ms, 800 ms, 1 s, and 10 s.
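A run of this kind can be mimicked numerically by modeling detections as a Poisson process. The sketch below is purely illustrative (the 7.4 events/s rate is borrowed from the result found later in this report) and shows the structure of one 256-experiment run:

```python
import random

random.seed(42)
rate = 7.4        # assumed events per second, roughly the rate found in this report
dwell = 0.1       # dwell time in seconds (the 100 ms run)
n_channels = 256  # one counting experiment per MCA channel

def count_events(rate, dwell):
    # Sum exponential waiting times between events until the dwell window ends;
    # the number of events that fit in the window is Poisson distributed
    t, n = 0.0, 0
    while True:
        t += random.expovariate(rate)
        if t > dwell:
            return n
        n += 1

counts = [count_events(rate, dwell) for _ in range(n_channels)]
print(sum(counts) / len(counts))  # should land near rate * dwell = 0.74
```

Histogramming such a `counts` list gives exactly the frequency-of-frequencies data analyzed in the Results section.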
Results
After every data collection run, I saved the data collected by the MCA into a separate ASCII text file. Examples can be found in my notebook entry for this experiment. These text files record how many events were counted while the MCA dwelled on a channel and which Region of Interest (ROI) the channel lay in. My concern was how often, and with what probability, a certain number of events could be expected to occur during a set dwell time. So for each run, I counted how many times a number x of events was counted in a dwell time window during that run. That data is plotted in the following graphic.
According to our best guess about what events the scintillator/PMT apparatus would detect at low sensitivity inside the lead cave, the probability of counting [math]\displaystyle{ k }[/math] events in dwell time [math]\displaystyle{ \Delta t }[/math], if the events happen at random at the average rate [math]\displaystyle{ \beta }[/math], would be
[math]\displaystyle{ f(k;\lambda)=\frac{\lambda^k e^{-\lambda}}{k!} }[/math]
where [math]\displaystyle{ \lambda }[/math] is the expected number of counts; in other words, [math]\displaystyle{ \lambda=\beta\Delta t }[/math]. This is the Poisson probability mass function, which often shows up in counting experiments where the things being counted are events occurring at random at a definite average rate, or objects placed randomly but with a definite average density in a space.
In order to verify this, I had to normalize my data and find out whether the Poisson probability mass function for the right value of [math]\displaystyle{ \lambda }[/math] fit the data reasonably well, and whether these [math]\displaystyle{ \lambda }[/math]'s fit the relation [math]\displaystyle{ \lambda=\beta\Delta t }[/math].
To find, for each data set, a [math]\displaystyle{ \lambda }[/math] such that [math]\displaystyle{ f(k;\lambda) }[/math] fits my normalized data as well as possible, I used a least squares regression technique.
least squares regression
Given a data set [math]\displaystyle{ (x_i,y_i) }[/math] and a family of functions [math]\displaystyle{ y=f(m;x) }[/math], one of those functions fits the data best in the sense that the sum of the absolute differences between the data [math]\displaystyle{ y_i }[/math] and the values [math]\displaystyle{ f(m;x_i) }[/math] is a minimum. Finding the minimum of [math]\displaystyle{ \sum_i{|f(m;x_i)-y_i|} }[/math] may not be as easy as finding the minimum of [math]\displaystyle{ \Upsilon^2(m)=\sum_i{(f(m;x_i)-y_i)^2} }[/math], which is minimized at a very similar value of [math]\displaystyle{ m }[/math], so minimization of [math]\displaystyle{ \Upsilon^2(m) }[/math] is most often used to find the parameter [math]\displaystyle{ m }[/math].
The use of [math]\displaystyle{ \Upsilon^2(m) }[/math] minimization can also be motivated by the maximum likelihood method of curve fitting: the [math]\displaystyle{ m }[/math] that minimizes [math]\displaystyle{ \chi^2(m)=\frac{\sum_i{(f(m;x_i)-y_i)^2}}{\sigma^2} }[/math] maximizes the likelihood that the data set was produced by a process modeled by [math]\displaystyle{ y=f(m;x) }[/math]. Since [math]\displaystyle{ \sigma^2 }[/math] is just a constant, [math]\displaystyle{ \Upsilon^2(m)\varpropto\chi^2(m) }[/math].
My data points were [math]\displaystyle{ (k,x_k) }[/math], where [math]\displaystyle{ k }[/math] is the event count number and [math]\displaystyle{ x_k }[/math] is the frequency count of how many times [math]\displaystyle{ k }[/math] events were counted in a dwell time. The event detection process can be modeled by the family of Poisson probability mass functions [math]\displaystyle{ y=f(\lambda;k) }[/math], where [math]\displaystyle{ y }[/math] is the probability that [math]\displaystyle{ k }[/math] events will be counted during a dwell. The [math]\displaystyle{ x_k }[/math]'s needed to be scaled down by a factor N so that [math]\displaystyle{ \sum_k{\frac{x_k}{N}}=1 }[/math], just as [math]\displaystyle{ \sum_k{f(\lambda;k)}=1 }[/math]; N = 256, the number of channels used in a run.
[math]\displaystyle{ \Upsilon^2(\lambda)= \sum_k{\left(f(\lambda;k)-\tfrac{x_k}{256}\right)^2} }[/math]
Since I could not come up with a formula to find the minimum of [math]\displaystyle{ \Upsilon^2(\lambda) }[/math] analytically, I had to do it graphically. Despite the complexity of the Poisson function, [math]\displaystyle{ \Upsilon^2(\lambda) }[/math] was always parabolic around the minimum. For example, here are plots of [math]\displaystyle{ \chi^2(\lambda) }[/math] versus [math]\displaystyle{ \lambda }[/math] for several of the dwell times.
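The graphical minimization amounts to scanning a grid of λ values and keeping the one with the smallest Υ². A minimal sketch, using made-up frequency counts rather than my actual data:

```python
import math

def upsilon2(lam, freq, n_channels=256):
    # Sum of squared differences between the Poisson PMF and normalized frequencies
    return sum((lam ** k * math.exp(-lam) / math.factorial(k) - x / n_channels) ** 2
               for k, x in enumerate(freq))

# Made-up frequency counts x_k: how many of the 256 channels saw k events
freq = [150, 80, 20, 5, 1]

# Scan a grid of lambda values and keep the one minimizing Upsilon^2
grid = [i / 1000 for i in range(10, 3000)]
lam_best = min(grid, key=lambda lam: upsilon2(lam, freq))
print(lam_best)
```

A finer grid around the first answer refines the estimate to any desired precision, which is why this method can be pushed to arbitrary precision as noted in the Conclusion.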
[Figures: plots of [math]\displaystyle{ \chi^2(\lambda) }[/math] versus [math]\displaystyle{ \lambda }[/math]. At a 10 ms dwell time the minimum is hidden near [math]\displaystyle{ \lambda }[/math] = 0; at a 400 ms dwell time the minimum is more visible at this scale; at longer dwell times it is easily visible. In each plot, [math]\displaystyle{ \lambda_{best} }[/math] is where [math]\displaystyle{ \chi^2 }[/math] is a minimum.]
The green lines in the above graphs mark [math]\displaystyle{ \operatorname{min}(\chi^2(\lambda))+1 }[/math], and where they intersect the [math]\displaystyle{ \chi^2 }[/math] curve is approximately [math]\displaystyle{ \lambda\pm\Delta\lambda }[/math].
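This read-off can be expressed in code. The sketch below is only illustrative: it uses a made-up parabolic χ²(λ), since near the minimum χ² was always parabolic, and finds where the curve crosses its minimum plus one:

```python
# Made-up parabolic chi^2(lambda) near its minimum, as seen in the plots above
grid = [0.1 + 0.001 * i for i in range(2901)]          # lambda from 0.1 to 3.0
chi2 = [40.0 * (g - 0.94) ** 2 + 5.0 for g in grid]

lam_best = grid[chi2.index(min(chi2))]
# The "green line" level: min(chi^2) + 1; its crossings bracket lambda +/- dlam
inside = [g for g, c in zip(grid, chi2) if c <= min(chi2) + 1.0]
dlam = (inside[-1] - inside[0]) / 2.0
print(lam_best, dlam)  # lam_best is about 0.94, dlam about 0.158
```

For a parabola a(λ - λ₀)² the crossing sits at Δλ = 1/√a, so the sharper the χ² minimum, the smaller the uncertainty.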
[math]\displaystyle{ \lambda }[/math] and [math]\displaystyle{ \Delta\lambda }[/math]
A simpler way to estimate [math]\displaystyle{ \lambda }[/math] is to take advantage of the fact that [math]\displaystyle{ \lambda }[/math] is the expectation value of a Poisson distribution; the average of the event counts per bin therefore gives an estimate of [math]\displaystyle{ \lambda }[/math]. To compare the methods, I provide the values obtained from both, along with the average deviation of the normalized data from each fit.
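The averaging method amounts to a weighted mean of the observed count values, with the frequency counts as weights (again using made-up frequency counts rather than my actual data):

```python
# Made-up frequency counts x_k over 256 channels: x_k = times k events were seen
freq = [150, 80, 20, 5, 1]

total_events = sum(k * x for k, x in enumerate(freq))
n_experiments = sum(freq)                  # 256 channels = 256 experiments
lam_avg = total_events / n_experiments     # sample mean estimates lambda
print(lam_avg)                             # 139/256 = 0.54296875
```

Its precision is limited by the 256 experiments per run, which is the limitation noted in the Conclusion.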
Dwell time | [math]\displaystyle{ \lambda }[/math], [math]\displaystyle{ \Chi^2 }[/math] method | Uncertainty [math]\displaystyle{ \Delta\lambda }[/math], [math]\displaystyle{ \Chi^2 }[/math] method | deviation from [math]\displaystyle{ \Chi^2 }[/math]-method fit | [math]\displaystyle{ \lambda }[/math], data average method | deviation from average-method fit |
---|---|---|---|---|---|
10 ms | 0.0261205 | 0.0010149 (3.87%) | 0.0013743 | 0.042969 | 0.0028351 |
20 ms | 0.0736898 | 0.0023908 (3.24%) | 0.0029991 | 0.10938 | 0.0052407 |
40 ms | 0.185892 | 0.007938 (4.27%) | 0.0082640 | 0.32422 | 0.015304 |
80 ms | 0.457154 | 0.022045 (4.82%) | 0.0014991 | 0.70703 | 0.020735 |
100 ms | 0.468385 | 0.011267 (2.41%) | 0.0079525 | 0.62891 | 0.012870 |
200 ms | 0.937453 | 0.033847 (3.61%) | 0.013170 | 1.2070 | 0.016055 |
400 ms | 2.58578 | 0.08610 (3.33%) | 0.013153 | 2.7617 | 0.013404 |
800 ms | 5.74562 | 0.08421 (1.47%) | 0.0074780 | 6.0234 | 0.0078470 |
1 s | 7.27233 | 0.09070 (1.25%) | 0.0065399 | 7.3242 | 0.0065502 |
10 s | 73.9022 | 0.6182 (0.84%) | 0.0074556 | 73.766 | 0.0074574 |
[Figures: normalized data and Poisson fits for dwell times of 10 ms, 20 ms, 40 ms, 80 ms, 100 ms, 200 ms, 400 ms, 800 ms, 1 s, and 10 s.]
[math]\displaystyle{ \lambda }[/math] vs. dwell time
Since, theoretically, these events are detected at a definite average rate, [math]\displaystyle{ \lambda }[/math] should depend linearly on the dwell time. Also, theoretically, no events should be observed for a dwell time of 0 ms, so [math]\displaystyle{ \frac{\lambda(\Delta t)}{\Delta t} = \text{constant} }[/math]. The later data show this trend, but the earlier data do not. The following is a plot of [math]\displaystyle{ \lambda }[/math] vs. dwell time, with several fits obtained from Michael Gold's linefit MATLAB script. As the error on my λ values I used the Δλ values obtained earlier.
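The two kinds of fit can be reproduced from the table above. This sketch is not Gold's linefit script itself; it is a plain least-squares line fit, run once unweighted and once with the standard 1/Δλ² weights:

```python
# lambda and delta-lambda from the chi^2-method columns of the table, vs dwell (ms)
dwell = [10, 20, 40, 80, 100, 200, 400, 800, 1000, 10000]
lam   = [0.0261205, 0.0736898, 0.185892, 0.457154, 0.468385,
         0.937453, 2.58578, 5.74562, 7.27233, 73.9022]
dlam  = [0.0010149, 0.0023908, 0.007938, 0.022045, 0.011267,
         0.033847, 0.08610, 0.08421, 0.09070, 0.6182]

def line_fit(x, y, w):
    # Weighted least squares line y = m*x + b (pass w_i = 1 for an unweighted fit)
    S   = sum(w)
    Sx  = sum(wi * xi for wi, xi in zip(w, x))
    Sy  = sum(wi * yi for wi, yi in zip(w, y))
    Sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    Sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    d = S * Sxx - Sx * Sx
    return (S * Sxy - Sx * Sy) / d, (Sxx * Sy - Sx * Sxy) / d

m_u, b_u = line_fit(dwell, lam, [1.0] * len(dwell))           # ignores the errors
m_w, b_w = line_fit(dwell, lam, [1.0 / e ** 2 for e in dlam]) # weights 1/dlam^2
print(m_u)  # slope in events/ms, close to the 0.007411 found in this report
print(m_w)
```

The 1/Δλ² weighting is what gives the early, small-Δλ points their outsized pull in the weighted fit.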
The fit that does not take the errors into account looks like the better of the fits. When the errors I computed are taken into consideration, the earlier data points gain far more weight than the later ones, which is understandable given that the Δλ I calculated grow larger toward the end and the earlier points are closely spaced. Looking at a logarithmic plot, the fit that ignores the error values still looks like the better fit.
There ought to be an additional error contribution from how little data is collected at small dwell times and from the relatively short runs.
Argument for additional data volume based error
Theory of Probability by M. E. Munroe [1], in its chapter on Poisson distributions, points to Molina's Poisson's Exponential Binomial Limit (New York, 1942) and to Molina's tables on the accuracy of Poisson distributions and their dependency on the number of trials.
Average counts per second
The fit that ignores the errors is [math]\displaystyle{ \lambda = 0.007411\,\Delta t\,(\text{ms}) - 0.209179 }[/math], meaning that with our setup, we were detecting 0.007411 events per ms, or 7.411 events per second.
Discussion
The Poisson distribution fit our event count data very well, never exceeding an average deviation of 4 counts per channel from the fit. However, a deviation that large can have a very noticeable effect on a Poisson distribution with a small [math]\displaystyle{ \lambda }[/math] and only 256 counting experiment results in the distribution. This is not reflected in my calculation of [math]\displaystyle{ \Delta\lambda }[/math] for each fit: the [math]\displaystyle{ \Delta\lambda }[/math] I calculated seem to shrink as [math]\displaystyle{ \lambda }[/math] does. The fact that the best-looking result for the average event detection rate came from a fit that ignored the [math]\displaystyle{ \Delta\lambda }[/math] I calculated suggests that [math]\displaystyle{ \Delta\lambda }[/math] was most probably roughly constant.
One interesting thing to note is the instance when the equipment counted 27 events in a single 200 ms dwell. With [math]\displaystyle{ \lambda \approx 0.937453 }[/math] for a 200 ms dwell time, the probability of detecting 27 events in one 200 ms dwell is [math]\displaystyle{ 6.2882\times10^{-30} }[/math].
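That number can be checked directly from the Poisson formula:

```python
import math

lam = 0.937453  # fitted lambda for the 200 ms dwell time
k = 27          # the anomalous count observed in one channel
p = lam ** k * math.exp(-lam) / math.factorial(k)
print(p)        # approximately 6.29e-30
```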
Judging from the appearance of the data taken by Knockel and Joseph (who had a very similar setup), the events we were counting do not appear to have an average count rate that fluctuates with time, at least over a period of several days, whatever the source of the events. The fluctuations in the average count in their data probably have more to do with changes in the sensitivity of the scintillator/PMT apparatus, which can be caused by changes in the voltage supplied to the PMT. That, in turn, could have been caused by the unreliable ground wiring in the Physics building; Professor Koch mentioned that Dr. Geremia found evidence of this.
Conclusion
The Poisson probability mass function fit the data we collected very well, supporting the assertion that the events we were detecting happened at random but at a definite average rate. The expectation values of the fits form a clear linear relation with the dwell times. The equipment we were using detected gamma rays at a rate of 7.411 events per second.
A side by side comparison of two methods of estimating the expectation values of the fits revealed that the [math]\displaystyle{ \Chi^2 }[/math] method and the data averaging method both produced fits that did not deviate from the data by large amounts. The [math]\displaystyle{ \Chi^2 }[/math] method consistently produced tighter fits, but only because it can estimate the expectation value to arbitrary precision, whereas the precision of data averaging is limited by the number of counting experiments done for each dwell time.
My knowledge of the limits on how tightly a Poisson distribution can fit this type of data is woefully inadequate. Perhaps I can look into this in future experiments, or try to find Molina's book on the matter.
Acknowledgments
I would like to thank the University of New Mexico's Physics department for allowing me to use their equipment to conduct this experiment.
I also wish to acknowledge Dr. Steven Koch, who supervised the lab and provided insight into the source of the fluctuations in the average count in Joseph and Knockel's data. He also found the user's manual for the MCA card we were using, which was a great help since I had forgotten how to use the software that came with it.
Lastly, I wish to acknowledge my lab partner Lorenzo Trujillo. He assisted greatly in the data collection and his experience with oscilloscopes proved very useful in the set up of this experiment and many others.
Koch Comments
- "Easy" things to add:
- Author & contact info
- Acknowledgments
- "Difficult" things to add or fix
- You have fascinating data analysis and comparison of various methods. Is fantastic for an informal report. However, as a polished formal report, is sometimes too informal.
- Expanded introduction with references