Holcombe:Statistics
Latest revision as of 18:08, 26 January 2011
"The picturing of data allows us to be sensitive not only to the multiple hypotheses that we hold, but to the many more we have not yet thought of, regard as unlikely, or think impossible." (Tukey, 1974)
"The great fun of information visualization is that it gives you answers to questions you didn’t know you had." (Ben Shneiderman)
Jody Culham error bars lecture (http://psychology.uwo.ca/JodyCulham/Courses/ErrorBars_Lecture.ppt):
"Rule of thumb for 95% CIs:
 If the overlap is about half of one one-sided error bar, the difference is significant at ~ p < .05
 If the error bars just abut, the difference is significant at ~ p < .01
 Works if n >= 10 and the error bars don’t differ by more than a factor of 2"
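The rule of thumb above can be written down as a small check. This is only a sketch: the function name `ci_overlap_rule` is made up for illustration, and it assumes the overlap is measured against the average of the two one-sided error bars, which the lecture quote does not spell out. It does not check the n >= 10 condition, so that is left to the caller.

```python
def ci_overlap_rule(mean_a, moe_a, mean_b, moe_b):
    """Eyeball rule for two independent 95% CIs (hypothetical helper).

    moe_a and moe_b are the one-sided error bars (CI half-widths).
    Returns a rough verdict string, not an exact p-value.
    """
    if moe_a <= 0 or moe_b <= 0:
        raise ValueError("error bars must be positive")
    # The rule only applies when the bars are within a factor of 2 of each other.
    if max(moe_a, moe_b) > 2 * min(moe_a, moe_b):
        return "rule not applicable"
    # Positive: the bars overlap; zero: they just abut; negative: there is a gap.
    overlap = (moe_a + moe_b) - abs(mean_a - mean_b)
    # Assumption: express the overlap relative to the average one-sided bar.
    proportion = overlap / ((moe_a + moe_b) / 2)
    if proportion <= 0:
        return "roughly p < .01"
    if proportion <= 0.5:
        return "roughly p < .05"
    return "not significant by this rule"
```

For example, means of 10 and 14 with one-sided bars of 2 each just abut, so the sketch reports the ~ p < .01 case.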
"If events are dependent (whether causal or not), the aggregate is not going to be Gaussian. " why?
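The sum of two independent random variables is distributed according to the convolution of their individual distributions. Repeated convolution is what drives the sum of many independent, finite-variance variables toward a Gaussian (the central limit theorem). When the events are dependent, the convolution rule no longer applies, so that convergence is not guaranteed.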
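The convolution rule is easy to check numerically. The sketch below is an illustration, not anything from the lab's code: it uses fair dice as the example variables and a hand-written `convolve` helper, with exact fractions so the probabilities can be compared without rounding.

```python
from fractions import Fraction

def convolve(p, q):
    """Distribution of X + Y for independent X ~ p and Y ~ q.

    p and q are lists in which index k holds P(value = k).
    """
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            # Independence is what lets us multiply the two probabilities.
            out[i + j] += p_i * q_j
    return out

# A fair die: index 0 is impossible, indices 1..6 each have probability 1/6.
die = [Fraction(0)] + [Fraction(1, 6)] * 6
two_dice = convolve(die, die)              # triangular, peak at 7
four_dice = convolve(two_dice, two_dice)   # peak at 14, closer to a bell shape
```

Each further convolution rounds the shape off a little more, which is the independent case the quoted sentence is contrasting with.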
Fitting curves to data
 R is often used in the lab
 Python, alone or with SciPy, can be used easily (example here)
 MATLAB is sometimes used
 MacCurveFit for OS 9 is rarely used
Bootstrapping
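How I fit psychometric functions and bootstrap (wiki page Holcombe:fit psychometric functions)
Howell's pages (http://www.uvm.edu/~dhowell/StatPages/Resampling/BootstMeans/bootstrapping_means.html)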
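For readers who have not bootstrapped before, here is a minimal sketch of a percentile bootstrap for a mean. It is not the procedure on the linked pages; the function name, the default of 2000 resamples, and the `lags_ms` data are all invented for illustration.

```python
import random
import statistics

def bootstrap_ci_mean(data, n_resamples=2000, alpha=0.05, seed=None):
    """Percentile bootstrap confidence interval for the mean of data."""
    rng = random.Random(seed)
    n = len(data)
    # Resample the data with replacement and record the mean each time.
    means = sorted(
        statistics.mean(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    # Take the alpha/2 and 1 - alpha/2 percentiles of the resampled means.
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return means[lo_idx], means[hi_idx]

# Hypothetical measurements, for illustration only.
lags_ms = [12.1, 9.8, 11.4, 10.9, 13.2, 8.7, 10.1, 12.5, 9.4, 11.0]
ci_low, ci_high = bootstrap_ci_mean(lags_ms, seed=1)
```

Fixing the seed makes the interval reproducible; with `seed=None` it will vary slightly from run to run.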
Effect size rant
In psychology training and in psychology journals, you may have heard of formulas for "effect size" like Cohen's d. An issue that came up in the lab yesterday provoked me into a long rant about this, which might help some of you if you ever wonder whether you should be calculating it:
If you look up "effect size" in a psychology context, you'll find things like Cohen's d. Cohen's d was invented for areas of psychology where the raw measure of the size of the effect (for us that would be the mean error in degrees per cycle, or the lag in ms) doesn't mean anything on its own, so they had to invent a measure that scales the relatively meaningless raw measure by its variability within conditions. This can make sense in paradigms like the implicit association test, where the raw measure is the difference in milliseconds between two conditions. There, no one actually knows what it means if the brain is 5 ms slower or 10 ms slower in doing something. So they divide the difference by some measure of the random fluctuations, to express how much bigger it is than those fluctuations. If you get something like a Cohen's d of 10, you can say wow, that's ten times bigger than the random fluctuations you get.
Fortunately, in vision science our numbers are more meaningful: a lag of a certain number of milliseconds means we are that many milliseconds behind the visual world. The errors are similar. These numbers have a more definite meaning than the difference in response time between two arbitrary tasks, so we have the luxury of comparing them across experiments directly.
Researchers studying other things aren't that lucky. If they ran three experiments, varying the task slightly from one to the next, the baseline of their dependent measure might change a lot. For example, if the number of choices in a task increased, the baseline response time could increase a lot. Then the response time differences across those experiments could not be compared directly, because elevating the baseline might change the difference you get (for instance, if the relationship between the underlying mental quantity and RT is not linear). In that case it would be better to compare something like Cohen's d, in the hope that dividing by the uncontrolled fluctuation accounts for the consequences of the baseline change.
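To make the scaling concrete, here is a sketch of Cohen's d in its common pooled-standard-deviation form (other variants exist). The data are invented: both pairs of conditions differ by the same 10 ms in the raw measure, but the within-condition variability differs, so d differs by a factor of ten.

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Cohen's d: mean difference divided by the pooled within-group SD."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd

# Hypothetical lags in ms: the same 10 ms raw difference in both comparisons.
d_quiet = cohens_d([48, 50, 52], [38, 40, 42])   # small fluctuations
d_noisy = cohens_d([30, 50, 70], [20, 40, 60])   # large fluctuations
```

When the raw 10 ms is itself interpretable, as argued above for lags in vision science, it can be reported directly and d adds little.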