User:Alexander T. J. Barron/Notebook/PHYC 307L Lab Notebook/Poisson notes

From OpenWetWare
Revision as of 15:36, 26 October 2008 by Alexander T. J. Barron

Poisson Statistics Lab

Partner: Justin Muehlmeyer


Set Up

SJK 09:45, 20 October 2008 (EDT)
Good methods and equipment

Searching for the cosmic radiation signals:

Using a Tektronix 485 Oscilloscope (analog) and ORTEC model # 266 PM (photo-multiplier) base.

  • Dynode output on PMT (photo-multiplier tube) to oscilloscope
  • Input voltage: -1500 V
  • PMT outside of lead brick cave

Oscilloscope settings:

  • 5 mV/division
  • 1 ms/division
  • AC triggering: negative slope (internal)

We observe occasional peaks each second that are distinguishable from noise. They change too fast for us to actually measure their peak amplitudes. This is our presumed cosmic radiation.

Using an amplifier:

  • We added an ORTEC amplifier to our board and connected the output of the dynode to the input of the amplifier. We wanted to test the signal coming out of the amplifier with the oscilloscope before connecting it to the computer. We ran into extensive difficulties even finding our cosmic radiation signal with the amplifier, and then again while bypassing it.


Troubleshooting:

New amplifier: Canberra PAD 814.

Switched from dynode to the anode on the PMT, with no effect.

Switched to a new PMT - Tracor Northern Model # TN1222. Much clearer, greater-magnitude cosmic radiation spikes with just the oscilloscope. Input ~-1400 V. Positive V output.

Still no results from the amplifier, so...

New Oscilloscope: Tektronix TDS 1002 (digital).

Experiment

"Derivative signal" from the amplifier.

Set up: PMT to PREAMP IN, PREAMP switch to IN, AMP OUT to oscilloscope (DC coupling, rising slope, 36.0 mV threshold)


Hooking Multichannel Analyzer Card (MCA) up to the software:

Software: PCA3 in MSDOS

Bins/Passes: 512

We decided to first take our dwell time to be 1 second.

Other passes: dwell time of 1, 10, 100 milliseconds, and finally 2 seconds.

Saved to ASCII files individually-- the .txt files are here:


Data Analysis

Overview

The goal of this lab is to see whether the frequencies of cosmic ray interactions with a scintillator over several different dwell times follow a Poisson distribution. They should by definition, assuming that the average rate of cosmic ray detection does not change over time. To test this hypothesis, I will first create separate histograms of event counts, calculate the mean of each, and use those means to fit a Poisson distribution over the histogram for each dwell time. At Dr. Koch's suggestion, I will compare against a Gaussian function for each case as well.


Results

Here are the histograms:

Data Distributions for Varying Dwell Time


Here is the MATLAB code:



%% Poisson Distribution Analysis

% Alexander Barron
% Junior Lab Fall 2008

close all; clear all;

%% Load .txt files

M1ms_bin = dlmread('JAND1MS.txt',',',[0 0 511 0]);  % bin vector 1ms dwell
M1ms_val = dlmread('JAND1MS.txt',',',[0 1 511 1]);  % value vector 1ms dwell

M10ms_bin = dlmread('JAND10MS.txt',',',[0 0 511 0]);
M10ms_val = dlmread('JAND10MS.txt',',',[0 1 511 1]);

M100ms_bin = dlmread('JAND100M.txt',',',[0 0 511 0]);
M100ms_val = dlmread('JAND100M.txt',',',[0 1 511 1]);

M1s_bin = dlmread('JAND1S.txt',',',[0 0 511 0]);
M1s_val = dlmread('JAND1S.txt',',',[0 1 511 1]);

M2s_bin = dlmread('JAND2S.txt',',',[0 0 511 0]);
M2s_val = dlmread('JAND2S.txt',',',[0 1 511 1]);


%% sort into frequencies for histograms

d1ms_count = zeros(1,50);       % Here I am creating empty
d10ms_count = zeros(1,50);      % vectors in which to put  
d100ms_count = zeros(1,50);     % my manipulated data.
d1s_count = zeros(1,50);
d2s_count = zeros(1,50);

for c=1:5;

    switch c;
        case 1;
            valmat = M1ms_val;          % generic value and
            countmat = d1ms_count;      % count matrices
        case 2;
            valmat = M10ms_val;
            countmat = d10ms_count;
        case 3;
            valmat = M100ms_val;
            countmat = d100ms_count;
        case 4;
            valmat = M1s_val;
            countmat = d1s_count;
        case 5;
            valmat = M2s_val;
            countmat = d2s_count;
    end;
    
    for j=1:512;                        % Here I scan my data for
                                        % desired frequencies
        for k=0:49;                     % (0-49 matches the 50-element
                                        % count vectors) and place
                                        % them in generic matrices.
            if valmat(j) == k;
                countmat(k+1) = countmat(k+1) + 1;
            end

        end

    end

    switch c;
        case 1;
            d1ms_count = countmat;      % transformation of
        case 2;                         % generic matrices
            d10ms_count = countmat;     % into appropriately-
        case 3;                         % named ones
            d100ms_count = countmat;
        case 4;
            d1s_count = countmat;
        case 5;
            d2s_count = countmat;
    end;
    
    clear valmat, clear countmat;
    
end;


%% Plot

freq = linspace(0,49,50);

scrsz = get(0,'ScreenSize');
figure('Position',[1 scrsz(4)/1.5 scrsz(3)/1.25 scrsz(4)/1.4]);

for f=1:5;
    
    switch f;
        case 1;
            data = d1ms_count;
            titl = '1 ms Dwell Time';
        case 2;
            data = d10ms_count;
            titl = '10 ms Dwell Time';
        case 3;
            data = d100ms_count;
            titl = '100 ms Dwell Time';
        case 4;
            data = d1s_count;
            titl = '1 s Dwell Time';
        case 5;
            data = d2s_count;
            titl = '2 s Dwell Time';
    end;
    
    subplot(1,5,f), bar(freq,data,'c');
    ylim([0 525]);
    xlim([-2 50]);
    set(gca,'XTick',[0 10 20 30 40 50]);
    ylabel('Frequency of Events');
    xlabel('Events');
    title(titl);
    
    clear data;
    
end;
    

As we increase the dwell time, each histogram begins to look more like a Gaussian function. Perhaps at very large dwell times a Gaussian would fit the data very well. It would have been nice to take more data to test this... maybe as my final lab report...


Calculating the Mean

A good method of determining the mean can be found from the Thermodynamics class notes of Dr. Geremia which I post here:

File:Ps1notes.pdf

Where we determine our mean λ to be the rate multiplied by the time. This comes from the definition of the mean, which is the summation of the (# of events) X (Probability of those events) over all values. Geremia shows that when simplified, this summation becomes (average rate per dwell time) X ( dwell time).
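The relation λ = (average rate) × (dwell time) can be checked with a quick simulation. The following is an illustrative Python sketch (not part of the lab's MATLAB analysis); the rate value is made up to resemble our 1 s dwell result:

```python
import random

# Illustrative simulation (not part of the lab's MATLAB analysis):
# a detector firing at a constant average rate r should give a mean
# count per dwell window of lambda = r * dt.  The rate here is made
# up to resemble our 1 s dwell result.
random.seed(0)

r = 13.5          # assumed rate, events per second
dt = 1.0          # dwell time, seconds
n_windows = 20000

counts = []
for _ in range(n_windows):
    t, k = 0.0, 0
    while True:
        t += random.expovariate(r)   # exponential inter-arrival times
        if t > dt:
            break
        k += 1
    counts.append(k)

mean_count = sum(counts) / n_windows
print(mean_count)   # close to r * dt = 13.5
```

With enough windows the mean count converges on r × dt, which is the result Geremia derives.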

I will use this formula to calculate the average for each dwell time:


[math]\displaystyle{ \widehat{\lambda}=\frac{1}{n}\sum_{i=1}^n k_i. \! }[/math]


We took 512 passes per dwell time; k_i is the count in bin i from the MCA data. Add this code to calculate the mean per dwell time:


%% Calculate Mean (lambda)

n = 512;

format long;
d1mslambda = 1/n * sum(M1ms_val)
d10mslambda = 1/n * sum(M10ms_val)
d100mslambda = 1/n * sum(M100ms_val)
d1slambda = 1/n * sum(M1s_val)
d2slambda = 1/n * sum(M2s_val)


Dwell Time Δt    λ
1 ms             0.013671875000000
10 ms            0.126953125000000
100 ms           1.294921875000000
1 s              13.488281250000000
2 s              25.361328125000000


This makes sense: as our dwell time increases, our average number of events increases. These values for λ roughly match the means of the above histograms. To make a probability density bar graph, I just divide each histogram bin by the number of passes.

Add this code at the end of the last block:


%% Transform to Probability Density Graph

probd1ms_count = d1ms_count./512;      
probd10ms_count = d10ms_count./512;  
probd100ms_count = d100ms_count./512;  
probd1s_count = d1s_count./512; 
probd2s_count = d2s_count./512;


%% Plot Probability Density Graph

scrsz = get(0,'ScreenSize');
figure('Position',[1 scrsz(4)/1.5 scrsz(3)/1.25 scrsz(4)/1.4]);

for f=1:5;
    
    switch f;
        case 1;
            data = probd1ms_count;
            titl = '1 ms Dwell Time';
        case 2;
            data = probd10ms_count;
            titl = '10 ms Dwell Time';
        case 3;
            data = probd100ms_count;
            titl = '100 ms Dwell Time';
        case 4;
            data = probd1s_count;
            titl = '1 s Dwell Time';
        case 5;
            data = probd2s_count;
            titl = '2 s Dwell Time';
    end;
    
    subplot(1,5,f), bar(freq,data,'c');
    ylim([0 1]);
    xlim([-2 50]);
    set(gca,'XTick',[0 10 20 30 40 50]);
    ylabel('Probability');
    xlabel('Events');
    title(titl);
    
    clear data;
    
end;


Result:


Probability Data Distributions for Varying Dwell Time


I tested whether these distributions sum to 1. They do. They are, indeed, probability density graphs.
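The normalization check is trivial to reproduce. A minimal Python sketch with made-up counts (not the actual MCA data):

```python
# Minimal sketch with made-up counts (not the actual MCA data): a
# histogram divided by its total number of passes sums to exactly 1,
# which is the normalization checked above.
counts = [410, 88, 12, 2, 0]      # hypothetical frequency counts
n_passes = sum(counts)            # 512 in the real experiment

prob = [c / n_passes for c in counts]
print(sum(prob))                  # 1.0
```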

Now for the error on lambda. I got this formula for the error on lambda from Tomas Mondragon's notebook entry last year:

[math]\displaystyle{ f(k;\lambda)=\frac{\lambda^k e^{-\lambda}}{k!} }[/math]


[math]\displaystyle{ Error=\frac{\sqrt{\sum_{k=\operatorname{min}\ k}^{\operatorname{max}\ k}\left(f(k,\lambda)-\frac{x_k}{N} \right)^2}}{\sqrt{N}\sqrt{N-1}} }[/math]

Where the frequency count of an event count [math]\displaystyle{ k }[/math] is [math]\displaystyle{ x_k }[/math]. N is 512 in this case.
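For reference, the two formulas above translate directly into code. This is an illustrative Python sketch (the analysis itself is in MATLAB), with function names of my own invention:

```python
import math

def poisson_pmf(k, lam):
    """f(k; lambda) = lambda^k * e^(-lambda) / k!"""
    return lam**k * math.exp(-lam) / math.factorial(k)

def fit_error(lam, freq_counts, k_min, k_max, n=512):
    """The 'error' formula above.  freq_counts[k] is x_k, the number
    of passes with exactly k events.  Function name is my own."""
    s = sum((poisson_pmf(k, lam) - freq_counts[k] / n) ** 2
            for k in range(k_min, k_max + 1))
    return math.sqrt(s) / (math.sqrt(n) * math.sqrt(n - 1))
```

For a perfect Poisson sample (freq_counts[k] = n · f(k; λ)) this returns 0, so smaller values mean a closer fit.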


So, once again, tack this code on the end of the MATLAB m-file:



%%  Error on Lambda

for j=1:5;
    
    switch j;
        case 1;
            lambda = d1mslambda;
            count = d1ms_count;
            t0 = 0;                 % minimum event count
            tf = 1;                 % maximum event count
        case 2;
            lambda = d10mslambda;
            count = d10ms_count;
            t0 = 0;
            tf = 2;
        case 3;
            lambda = d100mslambda;
            count = d100ms_count;
            t0 = 0;
            tf = 4;
        case 4;
            lambda = d1slambda;
            count = d1s_count;
            t0 = 7;
            tf = 20;
        case 5;
            lambda = d2slambda;
            count = d2s_count;
            t0 = 17;
            tf = 34;
    end;

    errorsum = 0;

    for t=t0:tf;

        errorsum = errorsum + ((lambda^t*exp(-lambda)/factorial(t) - count(t+1)/512))^2;

    end;
    
    error = ((errorsum))^.5/(512*511)^.5;
    
    
    switch j;
        case 1;
            d1ms_lerror = error
        case 2;
            d10ms_lerror = error
        case 3;
            d100ms_lerror = error
        case 4;
            d1s_lerror = error
        case 5;
            d2s_lerror = error
    end;
    
    clear error, clear errorsum, clear count, clear t0, clear tf;
    
end;


So now my full results with error are here:


Dwell Time Δt    λ                       λ error
1 ms             0.013671875000000       4.059741317917446e-07
10 ms            0.126953125000000       2.975294357739427e-04
100 ms           1.294921875000000       4.620595463752644e-04
1 s              13.488281250000000      3.259048960726763e-04
2 s              25.361328125000000      2.425961354924050e-04


This error seems plausible.

SJK 09:20, 20 October 2008 (EDT)
I haven't fully "re-understood" what Tomas was doing last year. But I believe his "error" was an un-normalized parameter of how good his Poisson fit was... not an "uncertainty." Your uncertainty is actually much easier to calculate. It's what we were talking about on the chalkboard: you count a total number of events and then divide by number of bins to get your best-fit lambda (SUM/N). The error in this is sqrt(SUM)/N.

I disagree with you that your uncertainty seems plausible, though I do agree it's better than E-19. I don't think 25.3613 +/- 0.0002 is reasonable for how much data you took.

Originally I obtained error values on the order of 10E-19, due to a coding error. In addition, I had originally thought to sum the differences from minimum to maximum count number and then square the result in the numerator of the error function, but I believe one actually squares each term before summing.
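Dr. Koch's suggested counting uncertainty, sqrt(SUM)/N, is easy to compute from the table values. A Python sketch for the 2 s dwell (numbers taken from the table above):

```python
import math

# Following Dr. Koch's comment above: the best-fit lambda is SUM/N,
# and its counting uncertainty is sqrt(SUM)/N.  Numbers are the 2 s
# dwell results from the table (N = 512 passes).
n = 512
lam = 25.361328125        # 2 s dwell mean
total = lam * n           # total events counted, SUM = 12985
err = math.sqrt(total) / n

print(lam, err)           # about 25.36 +/- 0.22
```

This gives roughly ±0.22, much larger than the 2.4e-4 in the table, consistent with Dr. Koch's objection.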

Fitting with Poisson

Here I use the lambda values to create Poisson plots over my data distributions. I also created a "range" overlay of Poisson to contrast the extremes in error I found, but the plots with error accounted for are indistinguishable from those without (at least at this scale). It appears that the Poisson distribution is pretty spot-on for 1 ms and 10 ms dwell, then matches the mean of the data without modeling its true shape. The data range of counts is decidedly narrower than the Poisson range at 1-2 s.


Probability Data Distributions for Varying Dwell Time vs. Poisson Distributions


Here is the code:


%% Fitting with Poisson

figure('Position',[1 scrsz(4)/1.5 scrsz(3)/1.25 scrsz(4)/1.4]);

for j=1:5;
    
    switch j;
        case 1;
            lambda = d1mslambda;    % <- here I define simple lambda w/out error                                              
        case 2;
            lambda = d10mslambda;
        case 3;
            lambda = d100mslambda;
        case 4;
            lambda = d1slambda;
        case 5;
            lambda = d2slambda;     
    end;
    
    for t=1:50;
        Poissonvec(t) = lambda^(t-1)*exp(-lambda)/factorial(t-1);
    end;
    
    switch j;
        case 1;
            Pd1msvec = Poissonvec;  
            data = probd1ms_count;
            titl = '1 ms Dwell Time';
        case 2;
            Pd10msvec = Poissonvec;
            data = probd10ms_count;
            titl = '10 ms Dwell Time';
        case 3;
            Pd100msvec = Poissonvec;
            data = probd100ms_count;
            titl = '100 ms Dwell Time';
        case 4;
            Pd1svec = Poissonvec;
            data = probd1s_count;
            titl = '1 s Dwell Time';
        case 5;
            Pd2svec = Poissonvec;  
            data = probd2s_count;
            titl = '2 s Dwell Time';
    end;
    
    subplot(1,5,j), bar(freq,data,'c'), hold on, plot(freq,Poissonvec,'.r');
    ylim([0 1]);
    xlim([-2 50]);
    set(gca,'XTick',[0 10 20 30 40 50]);
    legend('Data','Poisson');
    ylabel('Frequency of Events');
    xlabel('Events');
    title(titl);
    
    clear lambda, clear Poissonvec, clear data, clear titl;
    
end;


Fitting with Gaussian

This is the Gaussian Function, according to the Normal Distribution Wikipedia Article:

[math]\displaystyle{ \varphi_{\mu,\sigma^2}(x) = \frac{1}{\sigma\sqrt{2\pi}} \,e^{ -\frac{(x- \mu)^2}{2\sigma^2}} = \frac{1}{\sigma} \varphi\left(\frac{x - \mu}{\sigma}\right),\quad x\in\mathbb{R}, }[/math]

where [math]\displaystyle{ \sigma }[/math] is the standard deviation, and [math]\displaystyle{ \mu }[/math] is the expected value, or mean.

This is the relationship between the standard deviation and the mean, according to the Standard Deviation Wikipedia Article:

[math]\displaystyle{ \sigma(r) = \sqrt{\frac{1}{N-1} \sum_{i=1}^N (x_i - r)^2}. }[/math]

  • Alexander T. J. Barron 18:36, 26 October 2008 (EDT): OK, looking back, I don't know what I was doing. Above is the sample standard deviation. The standard error would be the sample standard dev. divided by root N.
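A minimal Python sketch of the two quantities just discussed (sample standard deviation and standard error), using made-up numbers rather than the lab counts:

```python
import math

# Sketch with made-up data (not the lab counts): sample standard
# deviation s, and the standard error s / sqrt(N) mentioned in the
# note above.
x = [24, 26, 25, 27, 23, 25, 26, 24]
N = len(x)
mean = sum(x) / N

s = math.sqrt(sum((xi - mean) ** 2 for xi in x) / (N - 1))
std_err = s / math.sqrt(N)
print(s, std_err)
```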


I tried fitting with the Gaussian function manually, as before with Poisson, but it was a disaster. I obtained these results with the Distribution Fitting Tool in MATLAB:


Gaussian Fit


The tool obviously uses different bin widths and count ranges from mine when generating its histograms, but it's easy to see the same trend as I obtained before. Clearly, the Gaussian is a horrible fit for low dwell time. Now I'll fit both Gaussian and Poisson together with the tool:

SJK 09:24, 20 October 2008 (EDT)
These graphs look really great, and a great way of presenting things! (I just think there's a problem w/ Poisson somehow)

Gaussian vs. Poisson Fit

SJK 09:36, 20 October 2008 (EDT)
It's important to report the resulting fit parameters, too!

This time I centered the histogram bins on integers in order to compare to my previous plots more effectively. It looks like I nailed the manual implementation of the Poisson function, so all that code was NOT in vain. Fabulous. This set of plots clearly shows how the Poisson distribution describes the experiment very well for low dwell, but degrades with increasing dwell. The Gaussian does the exact opposite. I'm not sure whether the Poisson really turns into a Gaussian... it looks like it does, but not the optimum kind for fitting this data set. As a side note, changing the binning options affected the Gaussian fit drastically. I'll have to keep this in mind in later experiments.

SJK 09:14, 20 October 2008 (EDT)
You're right in your observation that the Gaussian fits much better... but this isn't supposed to be the case! The Poisson and Gaussian should be much closer for the higher values... basically your Poisson distribution is way too wide for some reason. So far I can't see why. If your Poisson were too narrow, I'd just say the data had extra spread and drift. But if it really is a Poisson process, then it's impossible for the data to have less spread than they should!
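One way to check whether the Poisson really turns into a Gaussian: for large λ, the Poisson distribution approaches a Gaussian with μ = λ and σ² = λ. A quick numeric Python sketch at λ = 25 (roughly our 2 s dwell mean; values and range are illustrative):

```python
import math

# For large lambda, a Poisson distribution approaches a Gaussian with
# mu = lambda and sigma^2 = lambda.  Quick numeric check at lambda = 25
# (roughly our 2 s dwell mean).
lam = 25.0

def poisson(k):
    return lam**k * math.exp(-lam) / math.factorial(k)

def gauss(k):
    return math.exp(-(k - lam)**2 / (2 * lam)) / math.sqrt(2 * math.pi * lam)

max_diff = max(abs(poisson(k) - gauss(k)) for k in range(51))
print(max_diff)   # small compared to the peak value of about 0.08
```

So at this λ the two curves are already nearly identical, which suggests the mismatch in the plots above comes from the fit, not from the distributions themselves.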