Poisson Statistics

experimentalists: me (princess bradley) and Nikipoo

Goal

By measuring random events that occur very rarely, I hope to analyze how well a Poisson and a Gaussian can fit the data. I believe the random events we are measuring to be cosmic radiation, but I am not sure. We are measuring whatever radiation can make its way through a "house" of lead bricks and then activate a NaI scintillator in the physics building at UNM.

Theory

The Poisson and Gaussian distributions are probability distributions. I will assume you know what probability distributions are. Whatever radiation we are measuring occurs randomly, which is why I have chosen to analyze the data with the Poisson and the Gaussian (these distributions result from random events).

Poisson Distribution

When counting random events, the Poisson distribution is often used when the random events have a low probability of occurring. It is given by

[math]\displaystyle{ P(x)=e^{-a}\frac{a^x}{x!} }[/math],

where [math]\displaystyle{ a }[/math] is the mean, and [math]\displaystyle{ e^{-a} }[/math] is the normalization coefficient so that the sum of P(x) for every non-negative integer x is 1. Notice that the Poisson distribution is only defined for non-negative integers, so it is not continuous. The standard deviation of the Poisson is

[math]\displaystyle{ \sigma=\sqrt{a} }[/math].

According to the method of maximum likelihood, the best fit of a Poisson to data is obtained by taking the mean of the data to be [math]\displaystyle{ a }[/math]. That is, the mean of the best-fit Poisson distribution is the mean of the data.
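To make the fit concrete, here is a minimal sketch in Python (my choice of language for these examples; the counts are hypothetical stand-ins, not our measurements):

  # Sketch: fit a Poisson to count data by taking a = sample mean (the ML estimate).
  import math

  def poisson_pmf(x, a):
      """P(x) = e^(-a) * a^x / x! for non-negative integer x."""
      return math.exp(-a) * a**x / math.factorial(x)

  counts = [0, 1, 0, 2, 0, 0, 1, 0]  # hypothetical counts per dwell time
  a = sum(counts) / len(counts)      # maximum-likelihood Poisson mean
  for x in range(4):
      print(x, poisson_pmf(x, a))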

Gaussian Distribution

When counting random events, the Gaussian distribution is often used when there is a high probability of a random event occurring. It is given by

[math]\displaystyle{ G(x)=\frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{\left(x-a\right)^2}{2\sigma^2}} }[/math],

where [math]\displaystyle{ a }[/math] is the mean, [math]\displaystyle{ \sigma }[/math] is the standard deviation, and 1/√([math]\displaystyle{ 2\pi\sigma^2 }[/math]) is the normalization coefficient so that the integral over all [math]\displaystyle{ x }[/math] ([math]\displaystyle{ -\infty }[/math] to [math]\displaystyle{ \infty }[/math]) is 1. The Gaussian distribution is continuous, so it is called a "probability density function" (pdf).

According to the method of maximum likelihood, the best fit of a Gaussian to data is obtained by taking the mean of the data to be [math]\displaystyle{ a }[/math] and the standard deviation of the data to be [math]\displaystyle{ \sigma }[/math]. That is, the mean and standard deviation of the best-fit Gaussian distribution are the mean and standard deviation of the data.
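A matching sketch for the Gaussian, again with hypothetical data (note that the maximum-likelihood estimate of [math]\displaystyle{ \sigma }[/math] divides by N, not N-1):

  # Sketch: maximum-likelihood Gaussian fit (sample mean and sample std).
  import math

  def gaussian_pdf(x, a, sigma):
      """G(x) = exp(-(x-a)^2 / (2 sigma^2)) / sqrt(2 pi sigma^2)."""
      return math.exp(-(x - a)**2 / (2 * sigma**2)) / math.sqrt(2 * math.pi * sigma**2)

  data = [68, 71, 69, 70, 67, 72, 70]  # hypothetical high-frequency counts
  a = sum(data) / len(data)
  sigma = math.sqrt(sum((x - a)**2 for x in data) / len(data))
  print(a, sigma, gaussian_pdf(70, a, sigma))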

Equipment

  • photomultiplier tube (PMT) with NaI scintillator
  • coaxial cables with BNC connectors
  • about 40 lead bricks (needed to build a house for the PMT)
  • high voltage power supply for the PMT (1000 V should do the trick)
  • a means of acquiring data from the PMT so that frequency can be measured accurately (this is the trickiest part!)

I have described the equipment somewhat generally since the specific equipment we used will not affect the result of counting the signals from the PMT.

As for how I acquired data from the PMT: we used an amplifier to amplify the signal, which we connected to some chip in a computer, which works with really shitty software that, after we changed many settings, measures frequency.

Our setup

  • We plugged in the high voltage power supply to the power outlet and then to the PMT using coaxial cables.
  • Without using any radioactive source, we built a lead house around the PMT using the bricks (I think we did this to prevent local radioactive sources from altering the data, but I'm not sure).
  • We then used a really weird contraption that connected the power supply, through much other stuff, to power an amplifier, which we connected with coaxial cables to the PMT and the computer chip.
  • We then changed just about every setting that exists on the software (the shitty graphical interface made this difficult) to allow it to measure frequency (counts during a predetermined unit of time) and to allow it to put this frequency DATUM into a "bin." The program will fill many of these bins over a long period of time before stopping.

Procedure

The procedure was wonderfully easy, but took some time.

The data software has what is called a "dwell time," which is the predetermined unit of time associated with one bin. For smaller dwell times, the frequency data in the bins should become smaller, making the Poisson a better choice. To study how well the Poisson and Gaussian could fit the data depending on dwell time, we used a wide range of dwell times: 10ms, 100ms, 1s, 10s, and 100s.
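To illustrate what the dwell time does, here is a simulation sketch (assuming a fixed average event rate of 7 per second, roughly like our 1s data, rather than our actual detector):

  # Sketch: mean counts per bin scale with the dwell time for a fixed event rate.
  import numpy as np

  rate = 7.0  # hypothetical events per second
  for dwell in [0.01, 0.1, 1.0, 10.0]:
      counts = np.random.poisson(rate * dwell, size=4096)
      print(dwell, counts.mean())  # mean is roughly rate * dwell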

For dwell times 10ms, 100ms, and 100s, we had 4096 bins. For dwell times 1s and 10s, we only had 256 bins since more bins would have taken more time than we had (we could do 4096 bins for the 100s dwell time since we let the experiment run over the weekend). More bins is better since more bins will "smooth out the bumps" on the probability distribution from the data according to the law of large numbers (the formula for standard error of the mean also reveals that more bins gives smaller error).
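For example, using the standard-error formula from the "Data and Results" section below, going from 256 bins to 4096 bins shrinks the uncertainty of the mean by a factor of [math]\displaystyle{ \sqrt{4096/256}=4 }[/math].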

Data and Results

When counting random events, the uncertainty of one frequency DATUM is √(frequency). A similar statement is that the standard deviation ([math]\displaystyle{ \sigma }[/math]) of a group of data can be approximated by √(a), where [math]\displaystyle{ a }[/math] is the mean. In fact, for a Poisson distribution,

[math]\displaystyle{ \sigma_{Poisson}=\sqrt{a} }[/math],

which I mentioned in my "Theory" section.
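A quick simulated check of this [math]\displaystyle{ \sigma\approx\sqrt{a} }[/math] claim (a sketch using numpy's Poisson generator rather than our detector):

  # Sketch: for Poisson-distributed counts, sample std is close to sqrt(sample mean).
  import numpy as np

  counts = np.random.poisson(6.8, size=4096)   # hypothetical mean, similar to our 1s data
  print(counts.std(), np.sqrt(counts.mean()))  # these should agree to a few percent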

For this "Results" section, in addition to providing raw data plots, the mean, and the standard deviation, I will also give [math]\displaystyle{ \sigma_{Poisson} }[/math] and the relative error between [math]\displaystyle{ \sigma_{Poisson} }[/math] and [math]\displaystyle{ \sigma }[/math]. This analysis of the Poisson's standard deviation is worthwhile since the Poisson distribution approaches the Gaussian distribution when counting random events and when frequency becomes large.

Since the Poisson is always at least as good as the Gaussian when analyzing frequencies of random events, I was wondering why the Gaussian distribution would ever be used for this purpose, but Dr. Koch and MATLAB helped me understand that the Poisson is difficult to use for high frequencies since it is not continuous and since the factorial presents some computational challenges.
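One standard workaround for the factorial (a sketch, not what we actually did in the lab) is to evaluate the Poisson in log space using the log-gamma function, since [math]\displaystyle{ x!=\Gamma(x+1) }[/math]:

  # Sketch: log-space Poisson pmf avoids overflowing on the factorial.
  import math

  def log_poisson_pmf(x, a):
      """log P(x) = -a + x*log(a) - log(x!), with lgamma in place of the factorial."""
      return -a + x * math.log(a) - math.lgamma(x + 1)

  # a**x overflows a float for x = a = 7000, but the log-space version is fine:
  print(math.exp(log_poisson_pmf(7000, 7000.0)))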

For the uncertainty of my mean, I will use

[math]\displaystyle{ a=a_{best\ guess}\pm SE }[/math]

such that

[math]\displaystyle{ SE=standard\ error\ of\ the\ mean=\frac{\sigma}{\sqrt{number\ of\ bins}} }[/math].
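Here is how the numbers reported below could be computed from one run's bins (a sketch with simulated stand-in data in place of the real bin arrays; the relative-error definition is one plausible choice):

  # Sketch: summary statistics reported for each dwell time.
  import numpy as np

  bins = np.random.poisson(0.077, size=4096)  # stand-in for the real 10ms bins
  a = bins.mean()
  sigma = bins.std()
  se = sigma / np.sqrt(bins.size)             # standard error of the mean
  sigma_poisson = np.sqrt(a)                  # Poisson prediction for sigma
  rel_error = abs(sigma - sigma_poisson) / sigma_poisson  # one plausible definition
  print(a, se, sigma, sigma_poisson, rel_error)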

10ms

  • [math]\displaystyle{ a }[/math] =0.07666[math]\displaystyle{ \pm }[/math]0.00527
  • [math]\displaystyle{ \sigma }[/math] =0.3373
  • [math]\displaystyle{ \sigma_{Poisson} }[/math] =√0.07666≈0.2769
  • [math]\displaystyle{ Error_{relative} }[/math] =

100ms

  • [math]\displaystyle{ a }[/math] =0.6775[math]\displaystyle{ \pm }[/math]0.0157
  • [math]\displaystyle{ \sigma }[/math] =1.006

1s

  • [math]\displaystyle{ a }[/math] =6.766[math]\displaystyle{ \pm }[/math]0.191
  • [math]\displaystyle{ \sigma }[/math] =3.063

10s

  • [math]\displaystyle{ a }[/math] =69.23[math]\displaystyle{ \pm }[/math]0.69
  • [math]\displaystyle{ \sigma }[/math] =11.07

100s

  • [math]\displaystyle{ a }[/math] =
  • [math]\displaystyle{ \sigma }[/math] =

Fitting Gaussian and Poisson distributions to the data
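A hedged sketch of how the fitting could be done in Python (assuming matplotlib and scipy are available; hypothetical data stand in for the real bins):

  # Sketch: overlay ML Poisson and Gaussian fits on a histogram of the counts.
  import numpy as np
  import matplotlib.pyplot as plt
  from scipy.stats import norm, poisson

  counts = np.random.poisson(6.8, size=4096)  # stand-in for real bin data
  a, sigma = counts.mean(), counts.std()

  x = np.arange(counts.max() + 1)
  edges = np.arange(counts.max() + 2) - 0.5   # one histogram bin per integer count
  plt.hist(counts, bins=edges, density=True, label="data")
  plt.plot(x, poisson.pmf(x, a), "o-", label="Poisson fit")
  plt.plot(x, norm.pdf(x, a, sigma), "--", label="Gaussian fit")
  plt.legend()
  plt.show()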

Conclusion